ThinkingCog

Articles written by Parakh Singhal

Converting XML data into Relational Form using OPENXML

Key take away:

Sometimes there are requirements, direct or tangential, which require us to de-serialize data from hierarchical XML format into relational form. There are two approaches to do that in SQL Server. One uses the OPENXML function, which relies on the native features available in SQL Server. The other uses the “nodes” method of the XQuery language (a query language used to search XML documents), which has been baked right into SQL Server’s T-SQL. In this post I will be covering the OPENXML approach. This post is a prelude to the forthcoming post on the topic of inserting multiple rows into a SQL Server database table via XML.

Read on:

One of the realities of doing software development is handling various data formats. Most of the time this can be done cleanly at the application level, but sometimes you may be required to handle it at the database level. One such requirement is operating on multiple rows' worth of table data in a single call to the database. One of the approaches to doing so deals with supplying the data in XML form, de-serializing it at the database level into relational form, and then performing the desired operation on the data. This post deals with the preliminary step that is required before you do any CRUD operation on the data, i.e. converting the XML data into relational form. In this post I will cover the OPENXML approach to flatten the XML data into relational rowset form.

OPENXML approach:

OPENXML is a rowset provider, just like a table or a view, which allows access to XML data in relational form. OPENXML uses an in-memory representation of XML data to facilitate the relational form of data. The parsing of XML data and its push into the system's memory can be achieved with the help of a system stored procedure, sp_xml_preparedocument, which takes in the XML data in string format and returns a handle to the in-memory representation of the XML data. This handle, which is an integer, is then consumed by the OPENXML function and data can be queried from there onwards. One important point to note here is that since the XML data is parsed into memory, it becomes the responsibility of the developer to free up the memory after running the desired operation on the parsed XML data. This is achieved with the help of the sp_xml_removedocument system stored procedure. Thus the pseudo-code for the entire operation would look like:

1. Parse the xml document into memory by sp_xml_preparedocument.

2. Run the desired data operation using OPENXML, providing it the handle to the in-memory xml data returned by sp_xml_preparedocument.

3. Clean up the system memory by running sp_xml_removedocument, providing it the handle to the data that needs to be removed (provided by sp_xml_preparedocument earlier).

Consider the following code:

 

DECLARE @XML xml = 
'<Students>
    <Student id="1">
        <FName>Parakh</FName>
        <LName>Singhal</LName>
        <Age>30</Age>
        <Courses>
            <Course id="1">Fundamentals of Databases</Course>
            <Course id="10">Fundamentals of Networking</Course>
            <Course id="15">Fundamentals of Security</Course>
        </Courses>
    </Student>
    <Student id="2">
        <FName>Glen</FName>
        <LName>Bennet</LName>
        <Age>30</Age>
        <Courses>
            <Course id="12">Fundamentals of Data Warehousing</Course>
            <Course id="15">Fundamentals of Security</Course>
        </Courses>
    </Student>    
</Students>';
 
DECLARE @docpointer int;
 
EXEC sp_XML_preparedocument @docpointer OUTPUT, @XML;
 
SELECT
StudentID,
StudentFirstName,
StudentLastName,
StudentAge,
EnrolledCourse1,
EnrolledCourse2,
EnrolledCourse3
FROM OPENXML(@docpointer,'/Students/Student',2)
WITH
(StudentID int '@id', 
StudentFirstName varchar(50) 'FName', 
StudentLastName varchar(50) 'LName',
StudentAge int 'Age',
EnrolledCourse1 varchar(50) '(Courses/Course)[1]',
EnrolledCourse2 varchar(50) '(Courses/Course)[2]',
EnrolledCourse3 varchar(50) '(Courses/Course)[3]');
 
EXEC sp_xml_removedocument @docpointer;

This gives us the following result:

SQL result

Explanation of code:

The sample XML data is a collection of students under the appropriately named root node “Students”. Each “Student” node further consists of information about the student and the courses that he’s enrolled in. The sample XML is sufficiently complex to give us an opportunity to learn the following:

a) How to query data available in the form of attribute of an element like “id” of a student.

b) How to query various node elements like “FName”, “LName” and “Age”.

c) How to query a hierarchy available in the form of “Course” information.

The code first declares an int type variable. This will be used to store the handle pointing to the in-memory XML data parsed with the help of the system stored procedure sp_xml_preparedocument. The data is then parsed with sp_xml_preparedocument.

The OPENXML function within the SELECT query is where all the action happens. The OPENXML function takes three input parameters – the handle to the in-memory XML data representation, an XPath expression (the rowpattern) identifying the nodes that are to be parsed into relational rowset form, and a flag representing the type of mapping desired – attribute-centric, element-centric or a hybrid of both. I am using the element-centric mapping. For more information on the syntax of OPENXML and the associated flags, please visit the TechNet site.

The associated WITH clause of OPENXML describes the schema declaration that needs to be applied to the XML data in order to give it the desired rowset shape. Alternatively, the name of a table that already exists in the database and represents the desired rowset schema could be provided. We can opt to get all the data back from the parsed XML, or be selective about it. In the provided example I am parsing everything, except the “id” attribute of the “Course” elements.

The schema declaration takes in three parameters – the desired name of the column, a valid SQL Server data type mapped to the corresponding value of the attribute or element being queried, and a valid XPath expression describing how the XML nodes should be mapped to the corresponding column. If you look closely, you will find that there's a number appearing in square brackets in the portion of the schema declaration that deals with the “Course” elements. That is a positional predicate describing the relative position of the XML node in the hierarchy and mapping it to the desired column.

E.g. “EnrolledCourse2 varchar(50) '(Courses/Course)[2]'” signifies that the second “Course” element in the Courses hierarchy should be mapped to the EnrolledCourse2 column.
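
As a side note, the skipped “id” attribute of the “Course” elements can also be captured. The following is a minimal sketch of my own (not part of the original example) that targets the Course elements directly, producing one row per enrollment; it has to be run while the handle returned by sp_xml_preparedocument is still valid, and the column names are illustrative.

-- One row per Course element; '../../@id' walks back up to the grandparent
-- Student element, '@id' reads the Course attribute, 'text()' its content.
SELECT StudentID, CourseID, CourseName
FROM OPENXML(@docpointer, '/Students/Student/Courses/Course', 2)
WITH
(StudentID  int         '../../@id',
 CourseID   int         '@id',
 CourseName varchar(50) 'text()');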

Once we have made the schema declaration, we use the same column names in the SELECT clause as described in the schema declaration to query the XML data. Once you get the correct results, you can insert the data, update existing data or delete data from an existing table in your database.

In my next post I will cover parsing XML data with the help of the nodes method of the XQuery language available natively in T-SQL.

NOTE: I have deliberately not gone into an explanation of the syntax of the OPENXML function, as that can be perused from the official resources given in the references section.

References:

1. OPENXML (Transact-SQL) at TechNet

2. OPENXML at PerfectXML

Inserting Multiple Rows in SQL Server via a Comma Separated Values List

 

Key takeaway:

Frequently there are scenarios where you might be required to key in multiple rows into a SQL Server database table. In order to do that with a minimum number of calls and a low memory footprint, there are several approaches available. I know of three, of which I am presenting the comma separated values (CSV) list approach.

Read on:

A few days ago, I ran into a requirement that required me to insert multiple rows of data into a table in a SQL Server database. There are a couple of approaches to do this and I am planning to do a series on this. Some of the approaches to do this are the following:

1. Passing a comma separated value (CSV) list to the database and parsing it at the database level with the help of a user defined function to split the list into rows and inserting those rows.

2. Passing the data as XML hierarchy,

3. Using Table-Valued Parameters, a feature that SQL Server supports from version 2008 onwards.

All three approaches are valid, and usage depends upon the kind of constraints that you are operating with. I was dealing with a legacy codebase that did not afford me the luxury of table-valued parameters, which are the most straightforward and efficient way of accomplishing this, and had to settle for the CSV list and eventually the XML way.

In this post I will be covering the CSV list method. The comma separated values list method requires the following main ingredient: a user defined function that can be used to split text on a defined delimiter. Since this post is all about demonstrating the comma separated values list approach, I will not delve into the internal mechanics of such a function. I just grabbed the first function that I could get off the internet and made it work.

I got the split function from the following url: SQL Server Forums - Best split function, courtesy of the member Corey, alias Seventhnight.
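
For reference, here is a minimal sketch of the kind of splitter this technique relies on. It is not Seventhnight's original function, only an illustration that returns the same shape – an ID identity column and a Data column – that the stored procedure further below expects.

CREATE FUNCTION dbo.Split (@List nvarchar(max), @Delimiter nvarchar(5))
RETURNS @Result TABLE (ID int IDENTITY(1,1), Data nvarchar(max))
AS
BEGIN
    DECLARE @Pos int;
    SET @Pos = CHARINDEX(@Delimiter, @List);
    WHILE @Pos > 0
    BEGIN
        -- Everything before the delimiter is one item.
        INSERT INTO @Result (Data) VALUES (LTRIM(RTRIM(LEFT(@List, @Pos - 1))));
        SET @List = SUBSTRING(@List, @Pos + LEN(@Delimiter), LEN(@List));
        SET @Pos = CHARINDEX(@Delimiter, @List);
    END
    -- Whatever is left after the last delimiter is the final item.
    IF LEN(@List) > 0
        INSERT INTO @Result (Data) VALUES (LTRIM(RTRIM(@List)));
    RETURN;
END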

The technique works on the following strategy:

1. Assemble the information that needs to be stored in the database into a CSV list.

2. Pass it to the database packed in an appropriate type of parameter.

3. De-serialize the information into a table variable with the help of the aforementioned splitter function.

4. Depending upon the total number of columns in the target table, run a combination of insert and update commands, fetching the information from the table variable. A combination is made this way in order to minimize the disk activity, and complete the operation in as short duration as possible.

To demonstrate I will be using a very simple data model consisting of three tables: Student, Course and an association table supporting many-to-many relationship between Student and Course, StudentCourse. The Entity-Relationship diagram will clarify the relationship between the tables:

clip_image002

According to the ER diagram, a student can take many courses and a course can be taken by many students. Inserting the courses that a student is interested in into the association table is an ideal application of this technique.

The following ASP.NET web form is the web front end that we will use to form a complete example demonstrating this approach:

clip_image004

The coding for the webpage is very simple. The pseudo-code is as follows:

1. Select a student from the drop down.

2. Select from the available courses that the student needs enrollment for.

3. Click on the submit button.

I am going to leave it up to the reader to understand the programming in the web application. It is straightforward, and the domain model powering the application is a reflection of the data model depicted above.
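
For completeness, here is a hedged sketch of how the web front end might hand the selections to the database; the method, class and connection string are illustrative assumptions and are not lifted from the sample application.

using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class EnrollmentRepository
{
    // Packs the selected course IDs into a CSV list and passes it to the
    // stored procedure shown below.
    public static void EnrollStudentInCourses(int studentId, IEnumerable<int> courseIds, string connectionString)
    {
        string csvCourses = string.Join(",", courseIds);

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("dbo.EnrollStudentInCourses", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.Add("@StudentID", SqlDbType.Int).Value = studentId;
            command.Parameters.Add("@Courses", SqlDbType.NVarChar, -1).Value = csvCourses;

            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}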

All the magic happens in the stored procedure powering the enrollment of a student in one of the listed courses. The stored procedure is as follows:

CREATE PROCEDURE [dbo].[EnrollStudentInCourses]
    @StudentID int,
    @Courses nvarchar(max)
AS
BEGIN
    Declare @TempStudentCourseTable Table
    (
        StudentID int null,
        CourseID int null
    );

    Insert into @TempStudentCourseTable (CourseID)
    Select CONVERT(int,Data) from dbo.Split(@Courses,',');

    Update @TempStudentCourseTable Set StudentID = @StudentID;

    Insert into dbo.StudentCourse (StudentID,CourseID)
    Select StudentID,CourseID from @TempStudentCourseTable;
END
GO

 

The pseudo-code for the stored procedure is as follows:

1. There are two parameters in the stored procedure – the student ID of the student who needs to be enrolled, and the courses, passed as an nvarchar(max) parameter representing a CSV list of the courses in which the enrollment needs to be done.

2. The procedure uses a table variable that mimics the table structure of the StudentCourse table, sans the primary key.

3. An insert command is carried out on the table variable, with the CSV list of courses passed to the splitter function. Since our splitter function returns a table with two columns – ID (an int type identity column) and Data (nvarchar type) – we can right away leverage the Data field and insert it into the table variable, carrying out the required conversion on the fly. Here I am converting the Data to integer, since CourseID is of type int.

4. Once the insertion has been done, I update the StudentID field in the table variable with the passed parameter value.

5. Once this is complete, I run the Insert…Select… command to key the data into the final StudentCourse table. Depending upon the settings in the database, this will be a minimally logged operation, thereby giving you speed benefits.

Note that this is an overly simple example of the stored procedure, not employing transactions, error handling etc., and is meant to convey the concept of using a comma separated values list to insert multiple rows.

Emerging pattern:

If you look at this and carry it forward to more complicated use cases, you’ll see a pattern emerging. The pattern is:

1. Model the table variable after the final table that is supposed to receive the data.

2. Insert the most variable information in the table variable first.

3. Follow up the insertion by updating the table variable with information that does not change much, like the StudentID in this example. It remains constant for all the courses inserted into the table variable.

4. If you need selective action within the table variable, make use of an integer type identity column. In this example, we did not require that.

5. The complexity of the procedure will depend upon the total number of columns in the final table. The more the columns, the more the number of update commands. You may also have to manage a loop in order to update the table variable with the right row from the table emitted by the splitter function; a sketch of how the splitter's ID column can be used for that correlation follows below.
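
To make the pattern concrete for a wider target table, here is a hedged sketch of my own. It assumes a second CSV list of grades (@Grades) passed alongside the course IDs and the StudentID as stored procedure parameters; the two lists are correlated on the splitter's ID column, which avoids an explicit loop.

DECLARE @TempTable TABLE (StudentID int null, CourseID int null, Grade varchar(2) null, RowID int null);

-- Most variable information first, remembering which CSV position it came from.
INSERT INTO @TempTable (CourseID, RowID)
SELECT CONVERT(int, Data), ID FROM dbo.Split(@Courses, ',');

-- Correlate the second list on the splitter's identity column.
UPDATE t SET t.Grade = s.Data
FROM @TempTable t
INNER JOIN dbo.Split(@Grades, ',') s ON s.ID = t.RowID;

-- Finally, the information that does not change per row.
UPDATE @TempTable SET StudentID = @StudentID;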

In the next post I will cover how to insert multiple rows in SQL Server database using XML.

The sample code for this post consists of the web application and the powering database. Download it from:

Microsoft Reporting Services – The quirky uncle in the family

Key take away:

A Microsoft Reporting Services report needs a strongly typed dataset to work with, but it is generally overridden with an in-code dataset that is used to power the report. The report is sensitive to the name of the in-code dataset, and only works if it is provided with the same name as what gets recorded in the XML innards of the report.

Read on:

Recently I got an opportunity to help a team shift their reports from Microsoft Reporting Services (MRS) to SQL Server Reporting Services. I had left working with Microsoft Reporting Services back in 2010, when my team migrated all the reports to a hosted SSRS environment, so I spent some time creating a simple yet effective proof of concept ASP.NET application working with a Microsoft Reporting Services report that I could show to the team in transition, on how to import MRS reports into a hosted SSRS environment. Nearly two-thirds of my time was spent dealing with a quirk in Microsoft Reporting Services, which is providing the correct name of the dataset in the code, as the report wants it. I remembered this quirk from my days of working with Microsoft Reporting Services, but had forgotten the specifics of remedying the problem.

Microsoft Reporting Services is a port of SSRS that produces report(s) you can store in an ASP.NET application, along with the app contents, and uses the computing resources of the web application server to process data and serve the rendered view to the client. SSRS reports are a bit more capable, easier to work with and can be offloaded to a dedicated report server, so computing resources (compute cycles, memory, bandwidth etc.) can be saved on your primary web application server. Microsoft Reporting Services reports are a good choice when your reporting needs are simple, report generation is infrequent, and you are on a shoestring budget. Otherwise, always try to opt for a hosted SSRS environment.

clip_image001

Figure 1. Setup in Microsoft Reporting Services compared with hosted SQL Server Reporting Services (SSRS)

Microsoft Reporting Services requires a strongly typed dataset to work with and process data from. For SSRS the requirements are the same, but the process of making and dealing with a strongly typed dataset is easy and somewhat hidden. Both types of reports use Report Definition Language (RDL), a format used to convey the report contents and related artifacts. RDL in turn is based on XML, which means that RDL is case sensitive and any change in the schema can potentially render your report unusable. SSRS reports carry the “.rdl” extension while Microsoft Reporting Services reports end with the “.rdlc” extension, with the “c” in “.rdlc” denoting client, i.e. processing at the application server.

Now that we have understood a little background about the Microsoft Reporting Services, let’s understand the quirk.

Almost always, when developers deal with Microsoft Reporting Services reports, they make a report with a very simple strongly typed dataset; connect the artifacts on the report (tablix, matrix, charts etc.) with this simple dataset, but then override this dataset in their application code with data table(s) filled with data from the actual query, which essentially produces the same columns but qualitatively different data. I usually do this because strongly typed datasets cannot be debugged, and they are generally more time consuming to deal with than actual verbose code. Once the functionality to override the dataset is in place, the strongly typed dataset can be removed entirely. But until the overriding facility is in place, you have to retain the strongly typed dataset; otherwise the report won’t build.

Here I present a simple example consisting of three tables in a data model. The tables are self-explanatory in nature. I have filled the tables up with some junk data. I will use these to make a report in an ASP.NET Web Forms website, and will show where one needs to be careful in producing code that overrides the strongly typed dataset powering the report.

clip_image002

Figure 2 Data model

Use case: I might bind a tablix on an MRS report with a simple query of “Select FirstName, MiddleName, LastName, DateOfBirth from Person”, while the actual query that I want to run is “Select FirstName, MiddleName, LastName, DateOfBirth from Person inner join Address on Person.PersonID = Address.PersonID where Address.State = @State”, to get the details of people living in a certain state. So the columns that need to go on the report are the same, but the data that will now appear is qualitatively different.

clip_image004

Figure 3 Dataset name appears as “DataSet1” in MRS report powering my tablix.

The moment you insert a tablix, MRS is going to prompt you to create a data source containing a dataset powering the tablix. Once you finish the operation, you will find a strongly typed dataset in the App_Code folder of your ASP.NET web application.

clip_image005

Figure 4 Dataset name as it appears in a strongly typed dataset.

You can double click the typed dataset and get its name by clicking in the blue section and bringing up the properties. The name “DatabaseDataSet” came up by default. You can change that, but what matters is what gets registered in the report.

clip_image007

Figure 5 The name of the dataset as it appears in XML that makes up the report is of most importance.

Now here’s the code that you have to produce very carefully.

Make sure that when you are passing the report data source object to the report, it bears the same name as the dataset bound to the report, or more precisely, the name found in the XML of the report. It doesn’t matter what the name of the strongly typed dataset is.

To see the name as it appears in the report’s innards, open it with the inbuilt XML editor or just open it in Notepad and look for the entry with the name.

clip_image009

Figure 6 The name of the dataset that you provide in a report data source should be the same as in report's innards.

This quirk applies only to the name of the dataset that you need to provide. The name of the data source can be anything, and so can the name of the data table that carries the actual data in the dataset.
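
To put the quirk in code form, here is a minimal sketch assuming the report’s XML records its dataset as “DataSet1” (as in the figures above); the page class, report path, ReportViewer control and the DataTable holding the actual query results are illustrative assumptions, not taken from the sample code.

using System.Data;
using Microsoft.Reporting.WebForms;

public partial class PersonReportPage : System.Web.UI.Page
{
    // dt is assumed to be a DataTable already filled with the results of the
    // actual query (the join filtered on state).
    private void BindReport(DataTable dt)
    {
        ReportViewer1.LocalReport.ReportPath = Server.MapPath("~/Reports/PersonReport.rdlc");
        ReportViewer1.LocalReport.DataSources.Clear();

        // The first argument must match the dataset name recorded inside the
        // .rdlc ("DataSet1" in Figure 3), not the name of the strongly typed
        // dataset or of the DataTable.
        ReportViewer1.LocalReport.DataSources.Add(new ReportDataSource("DataSet1", dt));
        ReportViewer1.LocalReport.Refresh();
    }
}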

The sample code consists of two ASP.NET websites. One contains a strongly typed dataset, while the other one does not. Both websites have the same report powered by an in-code dataset. The code was made in Visual Studio 2012 and targets .NET Framework 4.5. The code can be downloaded from:

DALHelper – A convenient way to connect to SQL Server in .NET applications

Visit DALHelper.Codeplex.com for source code, ready to use assembly and related documentation.

Brief description

DALHelper is a project that was born out of my need to connect to SQL Server databases in my projects. I generally design my projects using a domain model, and to keep them flexible, understandable and fast, I prefer to connect to the database using ADO.NET. Writing ADO.NET commands to connect to the database can become monotonous and repetitive. Using Microsoft's Enterprise Library Data Access Application Block is one solution, but I generally find it requires a bit more effort than should be invested. Of course, it gives you the freedom to change your supporting back end database. But back end databases generally do not change, especially in enterprise environments and in small applications for which the express version of SQL Server can be sufficient, and if you work with such applications then I believe DALHelper can make your life a bit easier.

NOTE: This library cannot be used with any database other than SQL Server, as it is built upon SQL Server specific ADO.NET commands.

The library contains various helper methods that can simplify the task of doing CRUD (Create, Read, Update and Delete) on a SQL Server database in a .NET application. The library can be used with any .NET compliant language, is open sourced and will remain that way.

DALHelper touches upon the following technologies:
1. Visual C# 4.0
2. .NET Framework 4.0
3. ADO.NET technology, part of .NET framework, used to connect to databases.
4. Code Contracts, Microsoft’s implementation of the Design by Contract paradigm.

The DALHelper library has been designed to be provided with the connection string to the database once, via property injection, and then perform the various operations desired by the user with the help of the methods provided in the library. Some of the best practices that have been followed while the methods were designed are the following:

1. The methods follow a pattern in their signature and are easy to use.

2. The methods provided in the library are overloaded, thus providing you just the signature that you need in order to accomplish what you desire, without cluttering up your code.

3. The names of the methods are self-explanatory in the context of the service they are supposed to provide.

The variety of methods available should fit the majority of requirements found in any small to medium scale enterprise application. Every method available in the library allows you to execute a SQL query or a stored procedure.


Every method available in the library has two overloads:

1. Without parameters, and

2. With parameters

Every method available in the library has a signature of the following format:

1. SQL text: the SQL query or stored procedure that needs to be executed.

2. SQLTextType: an enum which signifies whether the SQL text parameter is a SQL query or a stored procedure. By default, the choice is that of a stored procedure.

3. List<SqlParameter>: an optional generic list of SQL parameters that might be required by the SQL query or stored procedure. The parameters can be of both input and output type.

Brief description of methods available in the library:

1. ExecSQL


The ExecSQL method can be used to execute a SQL statement or a stored procedure without returning any value. Available in two overloads:
public void ExecSQL(string sqlText, SQLTextType sqlTextType = SQLTextType.Stored_Proc)
public void ExecSQL(string sqlText, SQLTextType sqlTextType = SQLTextType.Stored_Proc, List<SqlParameter> parameterCollection = null)

 

2. GetRowsAffected

 

The GetRowsAffected method can be used to execute a SQL query or a stored procedure and return the total number of rows affected by execution. Available in two overloads:
public int GetRowsAffected(string sqlText, SQLTextType sqlTextType = SQLTextType.Stored_Proc)
public int GetRowsAffected(string sqlText, SQLTextType sqlTextType = SQLTextType.Stored_Proc, List<SqlParameter> parameterCollection = null)

 

3. GetDataTable

 

The GetDataTable method can be used to execute a SQL query or a stored procedure and return the result set in a data table. Available in two overloads:
public DataTable GetDataTable(string sqlText, SQLTextType sqlTextType = SQLTextType.Stored_Proc)
public DataTable GetDataTable(string sqlText, SQLTextType sqlTextType = SQLTextType.Stored_Proc, List<SqlParameter> parameterCollection = null)

 

4. GetDataView


The GetDataView method can be used to execute a SQL query or a stored procedure and return the result set as a data view. Available in two overloads:
public DataView GetDataView(string sqlText, SQLTextType sqlTextType = SQLTextType.Stored_Proc)
public DataView GetDataView(string sqlText, SQLTextType sqlTextType = SQLTextType.Stored_Proc, List<SqlParameter> parameterCollection = null)

 

5. GetDataSet


The GetDataSet method can be used to execute a SQL query or a stored procedure and return the result set(s) in a dataset. Available in two overloads:
public DataSet GetDataSet(string sqlText, SQLTextType sqlTextType = SQLTextType.Stored_Proc)
public DataSet GetDataSet(string sqlText, SQLTextType sqlTextType = SQLTextType.Stored_Proc, List<SqlParameter> parameterCollection = null)

DALHelper is an excellent way to implement the repository architecture style and simple CRUD implementations.
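
As a quick illustration, here is a hedged usage sketch; the class instantiation, the connection string property name, and the stored procedure and parameter names are assumptions made for the example, so check the documentation on CodePlex for the exact API.

using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class DALHelperDemo
{
    public static DataTable GetCoursesForStudent(int studentId)
    {
        // Provide the connection string once via property injection
        // (the property name is an assumption for this sketch).
        var helper = new DALHelper();
        helper.ConnectionString = "Data Source=.;Initial Catalog=School;Integrated Security=True";

        // Parameters for a stored procedure assumed to exist for this example.
        var parameters = new List<SqlParameter>
        {
            new SqlParameter("@StudentID", studentId)
        };

        // A stored procedure is the default SQLTextType; it is spelled out here only for clarity.
        return helper.GetDataTable("GetCoursesForStudent", SQLTextType.Stored_Proc, parameters);
    }
}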

Future Roadmap

In the future, I plan to extend this library into a framework supporting various data sources such as XML, Excel, CSV files etc.

Helpful Links

Some of the links that might be helpful in understanding the programming concepts and technologies that have gone into the making of this library:
Visual C#:

1. http://msdn.microsoft.com/en-us/vstudio/hh341490.aspx

.NET Framework:

1. http://www.microsoft.com/net

2. http://msdn.microsoft.com/en-US/vstudio/aa496123

ADO.NET:

1. http://msdn.microsoft.com/en-us/library/e80y5yhx.aspx

Code Contracts:

1. http://research.microsoft.com/en-us/projects/contracts

2. http://msdn.microsoft.com/en-us/devlabs/dd491992.aspx

Dependency Injection:

1. http://en.wikipedia.org/wiki/Dependency_injection

2. http://msdn.microsoft.com/en-us/magazine/cc163739.aspx

SignalR Sample Application 2 – Draggable Element

The last post showcased an elementary application that was meant to prime the reader on the basics of the SignalR library. This post builds upon the last post and makes a bit more involved web application that uses the jQuery UI library. The SignalR programming itself will remain simple. If you are new to SignalR, please first go through the introduction to SignalR.

Intention:

Make a draggable visual element which, when dragged around in a client browser, replicates the motion in other client browser(s).

To do this, we will take the help of the jQuery UI library, which builds upon the jQuery library. As I have already covered the basic steps involved in making a basic application in my last post, I will keep the instructions concise.

1. Make an empty ASP.NET Web forms website.

2. Add a SignalR Hub class with the name “MoveShape” and declare a public method “Move” which accepts two integer parameters, x and y, which we will use to pass the screen coordinates. Here is what the code should look like:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using Microsoft.AspNet.SignalR.Hubs;
 
 
public class MoveShape : Hub
{
    public void Move(int x, int y)
    {
        Clients.Others.shapeMoved(x, y);
    }
}

Explanation of code:

The Move method will receive two integer values signifying the screen coordinates of the visual element on screen. The method will invoke the method “shapeMoved” available on the client side and pass on the two coordinates it received originally. These coordinates will then be used on the client side to position the visual element accordingly.

3. Add a Global.asax class and register hubs in the “Application_Start” method.

4. Add a CSS file with the following code. Note that the code is relatively simple and a lot of it simply provides visual enhancement. It could be a lot simpler for this example.

body
{
    background: #261201;
    margin: 0;
    padding: 0;
    font-family: Sans-Serif;
    font-style: normal;
    font-size: 1em;
    font-weight: normal;
    font: Calibri, Verdana, Helvetica;
    color: #F2E7AE;
}
 
h1, h2, h3, h4, h5
{
    color: #A62103;
}
 
#container
{
    margin: 0.25em;
    padding: 0.5em;
}
#header
{
    margin-bottom: 0.25em;
    padding-left: 0.25em;
    border-bottom: 1px solid #EEE;
}
 
#draggable
{
    margin:1em;
    padding:1em;
    color: #A62103;
    background-color:#F2E7AE;
    width:100px;
    height: 50px;
    border: 3px solid #A62103;
    cursor:move;
}
 

Explanation of code:

The rule with the id of “draggable” is the one that makes up the visual element, in the form of a rectangle with a cream colored background.

5. Add an HTML file with the following code:

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <title>Draggable Item</title>
    <link href="Style/Brown.css" rel="stylesheet" />
    <script src="Scripts/jquery-1.9.0.js"></script>
    <script src="Scripts/jquery.signalR-1.0.0-alpha1.min.js"></script>
    <script src="Scripts/jquery-ui-1.10.0.custom.min.js"></script>
    <script src="signalr/hubs"></script>
    <script type="text/javascript">
        $(document).ready(function () {
            var hub = $.connection.moveShape;
            var draggableShape = $('#draggable');
 
            hub.client.shapeMoved = function (x, y) {
                draggableShape.css({ left: x, top: y });
                draggableShape.html("Drag Me!!<br />x: " + x + "<br />y: " + y);
            };
 
            $.connection.hub.start().done(function () {
                draggableShape.draggable({
                    drag: function () {
                        var position = draggableShape.offset();
                        $(this).html("Drag Me!!<br />x: " + position.left + "<br />y: " + position.top);
                        hub.server.move(position.left, position.top);
                    }
                });
            });
        });
    </script>
</head>
<body>
    <div id="container">
        <div id="header">
            <h3>Making draggable item using jQuery.</h3>
        </div>
        <p>
            Drag the box around and SignalR synchronizes the motion in all open browser windows.
        </p>
        <div id="draggable">Drag Me!!</div>
    </div>
</body>
</html>

Explanation of code:

In the HTML code we have referenced all the required JavaScript files, including the jQuery library, the jQuery UI library and the SignalR JavaScript client. Notice that I am using the latest stable versions of the jQuery and jQuery UI libraries available at the time of writing this post. In the jQuery code, I am making the hub available at the client side, which will be used to program the methods that need to be executed at the client side. I am caching the visual element that will be made draggable, as querying the DOM repeatedly would lead to less efficient code. The method shapeMoved receives two integer parameters and uses those to change the CSS properties for the x and y coordinates of the draggable element in all the clients. Once the hub connection is started, we make the div element bearing the id “draggable” draggable with the help of the in-built function available in jQuery UI, and declare the drag handler to record the x and y coordinates in the client viewport and execute the move function (which corresponds to the Move() method available in the server side code). The coordinates are also shown in the visual element, just for reference.

In the end the code hierarchy should look like the following image:

code hierarchy 2

Now run the website in Debug mode and you will be able to observe a box which, when moved in one client browser, also moves in other client browser(s).

Browsers 2

This post was a bit heavy on the jQuery code and not so much on the SignalR code, but it shows the vast potential of the library. Anywhere you might require real time functionality between server and client(s), you can use SignalR.

Source code:

Draggable source code

Thanks for reading.

Relevant Links:

1. Building Real-time Web Apps with ASP.NET SignalR

2. www.asp.net/signalr

3. https://github.com/SignalR/SignalR

SignalR Sample Application – Hello World

In my last post, I described the simplicity that SignalR brings to the table when we want to include real time asynchronous communication between server and client(s) in our application. In this post I will make a simple application, the proverbial ‘Hello World’ kind of application, to introduce the basic concepts of this new library.

The following are the pre-requisites for the application:

1. Visual Studio 2012

2. .NET Framework 4.5

3. ASP.NET and Web Tools 2012.2

The best thing about the Web Tools update 2012.2 is the included template that makes adding SignalR capabilities to your project amazingly easy. Our website will be based on ASP.NET Web Forms and will use the JavaScript client, which comes in the form of a jQuery plugin. So let’s start.

1. Fire up your Visual Studio and make a new empty ASP.NET Web forms web site.

2. Right click on the website and add a SignalR Hub class and provide “HelloWorld” as the name for the hub class. This will add the following items in the website:

a) App_Code folder – which will house our hub class – and the App_Start folder, which contains the RegisterHubs.cs file housing the code that maps the available hubs at application startup.

b) Scripts folder which houses all the JavaScript files, including the bundled jQuery library and the JavaScript client based upon the jQuery library.

c) Bin folder housing all the necessary assemblies that co-ordinate things on the server side. This also contains the JSON serializer Json.NET.

d) packages.config which lists all the external dependencies.

Now we will have to change this setup a little bit in order to make it work in a Web Forms website. Remove the App_Start folder and the RegisterHubs.cs file contained in it. The App_Start folder is something that an MVC website can use to execute code at application startup; a Web Forms website uses Global.asax to do just the same. I believe this is a template error, and it will be corrected in future releases.

Delete the App_Start folder and its contents

3. Add a Global.asax file to the solution. The HTML portion of your Global.asax file should look like this:

<%@ Application Language="C#" Inherits="Global" CodeBehind="~/App_Code/Global.asax.cs"%>
 
<script runat="server">
</script>

 

and the code behind portion should look like this:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.Routing;
using Microsoft.AspNet.SignalR;
 
/// <summary>
/// Summary description for Global
/// </summary>
public class Global:System.Web.HttpApplication
{
    void Application_Start(object sender, EventArgs e)
    {
        // Code that runs on application startup
        RouteTable.Routes.MapHubs();
    }
}

Explanation of code:

Note that we are required to include the System.Web.Routing and Microsoft.AspNet.SignalR namespaces and make the Global class inherit from the System.Web.HttpApplication class. The purpose of RouteTable.Routes.MapHubs() is to map the available hubs to a special default address: /signalr/hubs. If you want to make the hubs available at a different address, then you will have to provide a custom address to MapHubs as a string parameter, e.g. RouteTable.Routes.MapHubs(“~/signalr2”).

4. Now head over to the hub class and add the following code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using Microsoft.AspNet.SignalR.Hubs;
 
public class HelloWorld : Hub
{
    public void Hello()
    {
        Clients.All.hello();
    }        
}
 

 

Explanation of code:

The main thing to notice here is the inclusion of the namespace Microsoft.AspNet.SignalR.Hubs and the derivation of the hub class HelloWorld from the parent class Hub. From the perspective of the client-server model, this is the class that serves as a hub and will receive all the communication from the client(s) and forward it to other client(s) or take some action on the server side. In the code, we are declaring that there is a server side method called “Hello” which, when executed, will execute a method called “hello” available on all the clients.

5. Now add an HTML page and add the following code in it.

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <title>Hello World! with SignalR</title>
    <script src="Scripts/jquery-1.7.1.min.js"></script>
    <script src="Scripts/jquery.signalR-1.0.0-alpha1.min.js"></script>
    <script src="/signalr/hubs" type="text/javascript"></script>
    <script type="text/javascript">
        $(document).ready(function () {
 
            //defining hub which will have client and server side methods defined.
            var hub = $.connection.helloWorld;
 
            //defining what the client side "hello" method should do when called.
            //by the server side code.
            hub.client.hello = function () {
                $('#AddText').append('Hello World!<br />');
            }
 
            //starting the hub and specifying what should server side code should be
            //called when a certain client side event occur.
            $.connection.hub.start().done(function () {
                $('#SubmitButton').click(function () {
                    hub.server.hello();
                });
            });
        });
    </script>
</head>
<body>
    <div id="AddText"></div>
    <input type="button" id="SubmitButton" value="Say Hello" />
</body>
</html>

 

Explanation of code:

Here we reference all the necessary JavaScript libraries, among which are the jQuery library and the SignalR JavaScript client based on jQuery. There is one special JavaScript reference, “/signalr/hubs”, which is the JavaScript that is dynamically generated and referenced by the DOM at runtime. The body of the HTML document consists of a div element bearing the id “AddText”, to which we will add the text “Hello World!” each time the button “Say Hello” is pressed. The jQuery code for this example is very simple. In it we declare the name of the hub class. Note that we are using camel casing, as per JavaScript coding conventions, to refer to our server side hub class. Thus “helloWorld” on the client side is “HelloWorld” on the server side. We then define the client side “hello” method, which actually adds the text “Hello World!” and a line break to the div element “AddText”. We then start the connection and, when done, observe the click event of the “Say Hello” button. When the click event happens on the client side, we execute the server side method “Hello” of the hub class, which in turn executes the “hello” method available on all the clients. This adds the text to the div element at all clients.

Speaking diagrammatically, this is what we are aiming to do:

Client-Server Model

In the end the code hierarchy should look like the following image, with all the code in the right place.

Code hierarchy

6. As the last step, start debugging and run the website in IIS Express. Open the website in multiple different browsers and press the “Say Hello” button. It will add the text “Hello World!” in all the browsers simultaneously in real time.

Multiple browsers

Website running in IE9 and Firefox, receiving instructions from server in real time.

The code example presented here has been deliberately kept simple to prime the reader on the basic concepts of this new library.

In a nutshell we achieved:

1. Real time communication between server and client(s).

2. Auto negotiation of protocol (WebSocket, forever frames, long polling etc.) between the client(s) and server.

3. Simple, unified programming interface built upon C# (server side) and jQuery (client side in ASP.NET).

Download code:

SignalR sample application download

In the next post I will demonstrate a more involved example, utilizing jQuery UI. Stay Tuned.

Introduction to SignalR

Key Takeaway:

SignalR is the newest member of the .NET ecosystem fully supported by Microsoft, offering an abstraction over the transport layer that enables developers to make scalable applications with real time asynchronous communication between server and client(s), and vice versa.

Read On

In the client-server model, there are two modes of communication at play – push or pull. In the pull paradigm, it is the client that pulls the information from the server. In push, it is the server that explicitly sends the information to the client. The web model is based on pull technology, affectionately known as the request-response pattern. The client (browser) requests information from the server, and gets the response (web page, data etc.). Once this transaction is done, the server merrily goes back into a state of amnesia. This pattern is the reason why the web is stateless.

Whenever there is a requirement for data to be explicitly pushed from the server, it poses a different set of problems than what the stateless nature of the web is designed to solve. You then have to keep track of connections going to various clients, all the while maintaining scalability and performance.

To achieve push from the server, various kinds of hacks are put in place, all based on the pull paradigm. The various sleights range from opening a never ending connection to the server to polling the server again and again. There is even a standard term for such kinds of client-server interactions – COMET.

COMET operations fall under two categories – polling and streaming. Streaming is when the client opens up a connection to the server and keeps it open. Such a connection is called a persistent connection, and a characteristic of such a connection is the “Connection: keep-alive” header at the transport level. Persistent connections are predominantly used in web applications targeting real time data that needs to be served to a variety of versions of various browsers. One fine example is that of Google Finance. Notice that it uses a persistent connection to bring in data continuously about the live condition of the financial market.

Persistent Connection 1

Google Finance uses persistent connection to update the webpage continuously.

Polling is the technique where the client keeps sending requests to the server, asking for any new data that the server might have. A good example of polling is the Dropbox client application.

Polling 1

Dropbox client application in Windows 7 does polling.

Yet another trick involves using an iframe, which is fundamentally an HTML document embedded within a parent HTML document. This technique is primarily used to show ads dynamically within a web page and periodically pull a new one into the iframe.

iframe 1

Yahoo Mail’s new interface uses iframe to show advertisements.

The newest technology on the block is the WebSocket protocol. WebSocket is part of the HTML5 draft and is a truly duplex and persistent form of connection to server side processes. But sadly, not all browsers, and more importantly web servers, support the WebSocket protocol as of today. So if you are developing a public facing web site, you cannot truly rely on WebSocket yet.

So, all in all, right now we have a conundrum of technical tricks and an upcoming technology that is not uniformly supported by all client browsers and web servers to solve the problem of serving real time data from server(s) to various client(s). Adding to the complexity is the fact that there are various kinds of clients available, ranging from web browsers to sensors (that can use the .NET Micro Framework) to native applications in the iOS and Android ecosystems to desktop applications that require real time data pushed from a central server.

Meowpheus

This is where SignalR comes into play and provides a unified approach to handle the requirement of asynchronous real time duplex communication between client(s) and server.

The way SignalR works is that it auto-negotiates the protocol between the client and the server based on the capabilities of the pair. So if they support WebSocket, then that is used. If not WebSocket, then SignalR falls back to server sent events, if not that then forever frames, and so on. So the developer is not required to worry about protocol detection, usage and eventual graceful degradation. SignalR handles this automatically, all the while providing a uniform API to program against. SignalR is being actively developed by Damian Edwards’s team at Microsoft and they have already released the first version in the fall update (2012) of ASP.NET. SignalR scales well and already supports SQL Server, Service Bus and key-value systems like Redis. Client side APIs are available for .NET (4.0 and 4.5), HTML in the form of a jQuery client, WPF, Silverlight 5 and Windows Phone 8.

In the next post I will make a simple application using SignalR. In the meantime please learn more about this awesome technology from the following videos:

1. Damian Edwards and David Fowler presenting basics of SignalR (Sorry video is too big to fit in here.)

2. Keeping It Realtime Conference - SignalR - Realtime on ASP.NET (Scott Hanselman & Paul Batum)

Please head out to the following links to get more information:

1. SignalR.net

2. www.asp.net/signalr

3. Source code of SignalR at GitHub

Coming up in future

My last post, although not very utilitarian, was an exciting one for me, as it allowed me to share my thinking. This post continues the theme of my last post. I wanted to share some of the choicest videos that help us see a glimpse of what the future holds for us. See how research conducted at avant-garde institutes might completely change our lifestyle and our thinking, and give new insights into solving problems ranging from transportation to mental disorders.

In the first video Eric Schmidt (Chairman, Google) discusses how ideas that had been in incubation for a long time are now coming to fruition. This specially ties into my previous post on how the proliferation of cloud computing and bandwidth are intertwined.

Eric Schmidt of Google talks at Princeton about the future of technology

 

I am a big fan of the innovation coming out of Google. One example is Project Glass. It is worth noting how they have captured the facets of life where such a device might be of help, although I doubt people will just put it on the moment they wake up.

Project Glass: One day...

 

In a similar tone, here is Microsoft’s HoloDesk.

Microsoft's HoloDesk - Direct 3D Interactions with a Situated See-Through Display

 

Sebastian Thrun’s Stanley was the winner of the DARPA Grand Challenge 2005. That challenge has inspired both the tech and auto industries to create driverless cars. Listen to him give a beautiful introduction to the concept.

Sebastian Thrun: Google's driverless car

 

Simplifying complex concepts is an art unto itself, and one that takes a lot of deliberate practice. Listening to Juan Enriquez discuss the direction that biotech will take and its implications for life is mesmerizing.

Juan Enriquez: Using biology to rethink the energy challenge

 

Juan Enriquez: Will our kids be a different species?

 

Juan Enriquez: The next species of human

 

While advancement in technology mostly yields very tangible results in the form of automation, the time spent dealing with various problems can also yield simple solutions and completely new insights. This video shows how VS Ramachandran found new insights on curing the phenomenon of the phantom limb.

VS Ramachandran: 3 clues to understanding your brain

These are just some of the videos that help us see what the future holds for us and understand the power of science and engineering, and why STEM courses are important for the future growth of a nation. I just want to sign off this post with this video of Sir Ken Robinson, emphasizing the changes that need to be brought to the education system to help catalyze and realize creative thinking.

Sir Ken Robinson: Do schools kill creativity?

 

I hope this post made you as excited as I was when I got to revisit these videos.

Cloud computing and what it means for internet pipes

Key takeaway:

Technology companies, in order to promote their cloud platforms and integrate them into our daily lives, might have to take ownership of the infrastructure supporting the internet, and in the process might also leverage this endeavor to become service providers.

Read on:

In the not so distant future, we will be travelling in self-driving cars that adjust the cabin climate according to the ambient climate and the habits of the user, play music from your home server or read news/business reports using a digital subscription, and, after dropping you off at the office, go and pick up your groceries based off of the list sent to your car by your refrigerator. Once you reach home, you will be greeted with an ambience that matches how your schedule went during the day. You will automatically be reminded by your home about your favorite TV serials scheduled for the day, and it will automatically record them for later playback, in case you are unable to see them when scheduled.

 

Microsoft’s vision of productivity in future

 

 

A Day Made of Glass... Made possible by Corning.

 

You see that a lot of what I mentioned is already happening around us in one form or the other; it is just that it is not happening in one cohesive form, one mass that can act seamlessly. But that will change in the future. A lot of what I mentioned in the opening paragraph requires the following things:

1. Data about you – living style, requirements, frequency and kind of purchases etc.

2. Appliances that operate on standard protocols

3. Bandwidth

The big data and analytics movement aims to solve the first requirement and if you look at the concept of shopping recommendations, to a certain extent, it does solve it, and with time and data it will get better. The second requirement of appliances that operate on standard protocols is also being worked upon. The third requirement of bandwidth is the focus of this post and is the most interesting to me at this moment. Let me present my perspective on where things are and where they are heading.

It has been a long time since the term “Cloud Computing” was first coined, and it materialized in the form of Amazon Web Services in 2002. Cloud computing has since evolved into various kinds of services, the gist of all being that the consumer of the service is not required to engage in managing IT infrastructure. The service can be in the form of providing an operating system as a service (Windows Azure), which can then become a foundation to run programs written in the various programming language(s) supported, or it can be a simple Customer Relationship Management (CRM) system (Salesforce). All these are offered as services that are managed by their parent companies, freeing the service consumer to focus on what they do best, and not invest capital and human resources into managing infrastructure.

Now if we take a step back and look at how principles of economics governed the spread of general computing, we will find that general computing first targeted the enterprise space to make them efficient and save cost, and then came down to the general masses. Similarly, cloud computing is currently targeting the enterprise space heavily, enticing them with the convenience and upfront capital cost savings the concept brings with it, but has also started to crawl into the consumer space. Look at DropBox, Google Drive, Microsoft SkyDrive; they are not enterprise storage solutions, but rather consumer facing cloud storage solutions. In the entertainment section look at iTunes, Amazon AutoRip, Google Play and Netflix; they are not enterprise solutions at all, but rather a flavor of cloud services geared towards consumers.

An interesting side effect of cloud computing is that it can be extended to individual consumers and used to gather data about their purchasing habits. Applications and services like iTunes, Google Play, Amazon, Netflix etc. take into account your choices and, based on your past preferences, suggest potential songs, movies, books, goods, services etc.

Since these services rely on internet as the medium of conveyance, the bandwidth available between the offering and the consuming entities can become an issue. This becomes a bigger concern if the service provider hosts a public facing service that is heavy on bandwidth such as video, for example Netflix.

In such situations, the growth of the service provider depends upon meeting the demands of the service consumer, and ensuring that there are reliable and redundant pipes available. Given this, it would bode well, if they have a say in the upkeep of these internet conduits. A more desirable situation from the service provider’s perspective would be to own this essential piece of infrastructure, if they want to grow their cloud platform.

As far as I can see and evaluate, that is already happening, albeit at a slower pace. Companies like Google, I believe, already have a strategy and have started acting on it in the primary markets. We can see that in the form of high speed fiber connectivity in Kansas City, Missouri. Recently, Google started offering free Wi-Fi in limited areas in New York and in the City of Mountain View. Fast connectivity means more viewing of high definition video, more usage of cloud storage and more video conferencing, leading to an alternate source of income for cloud computing provider(s) and ultimately an end to end solution, i.e. collection of data about habits, leading to predictive analytics, leading to an automated lifestyle. So it starts with the ownership of internet connectivity in city areas and might eventually end up with transcontinental internet pipes, the company becoming a service provider in the process.

It gels well with the philosophy and selling points of cloud computing which are redundant data backups at geographically dispersed locations and content distribution networks which serve content from the data center which is located nearest to the consumer. One more pointer in this direction is the purchase of Motorola Mobility by Google, and now having access to a cache of intellectual property in the form of telecom patents.

One of the primary reasons for this trend is that if you look at history, it will be apparent that the telecom companies have not done much to push the limits of bandwidth on a pro-active basis. They just offered to the consumers whatever lowest common denominator they could come up with that proved profitable to them. Before the advent of cloud computing, there was in fact no stake for the tech companies either. But all that is changing. More mobile devices, proliferation of video content, video communication, in-app purchases, cloud storage etc. require more and more bandwidth. That’s too much of a responsibility to be left with Ma Bell, especially when they are not getting a cut of the pie. Self-driving cars, home automation, intelligent thermostats, refrigerators etc., all eventually leading to an internet of things, would require a whole lot of bandwidth and redundancy. Retina screens, HD TVs and 4K TVs need loads of bandwidth in order to render the depth and richness that they have been designed for. Thus, the future depends on how efficiently we carry data to and fro between the serving and consuming points, and that requires discarding copper conduits and embracing optical and high speed wireless technologies with new standard protocols that work efficiently, precisely what pioneer companies like Google are quietly working on.

See more:

1. The future according to Google's Larry Page

2. The Internet of Things

3. Eight business technology trends to watch

Using DropBox to back up your Source Control repository

In organizations of every size, code is generally managed with the help of source control. It is great for keeping a versioned history and doing branched modifications. The code repository is generally kept in a regularly backed up environment, with someone caring for it with all their wit and skill. But the same level of service and peace of mind is not available to those same developers working on their weekend projects on their personal computers. Granted, they can have source control, but having a service regularly take backups of the repository can be hard; sometimes because of a lack of time and/or storage media, and at other times because the backups are fragmented across different places and we have no idea of the whereabouts of the latest one. Cloud storage removes this obstacle and allows us to take automatic backups, in fact instantaneous backups, of the repository whenever anything in the repository is revised or added, without the hassle of handling any storage media. This has been made possible fundamentally by the fact that most cloud storage services provide a client application that can watch over a certain folder for any changed and/or new bits of information.

Here I will be covering how to make DropBox's Windows client application take care of a Tortoise SVN repository.

myImage

Figure 1 Conceptual diagram

Steps involved:

1. Install DropBox’s client application, appropriate for your operating system, with advanced parameters, specifying the name and location of the DropBox folder and choosing selective sync. If you have multiple accounts, make sure you provide the credentials of the account where you want a copy of the repository to be stored.

clip_image001

Figure 2 Advanced installation

 

clip_image002

Figure 3 Location of backed up folder

 

clip_image003

Figure 4 Using selective sync to better use your storage space

 

clip_image005

Figure 5 Installed directory

2. Install Tortoise SVN with default parameters.

3. Go to the folder that is being watched by the DropBox client service for any updates or additions and make a new folder in which you want the repository structure housed.

clip_image007

Figure 6 Create repository in DropBox folder

4. Right click on the folder and create a new repository there. Tortoise SVN will create a new repository there (there will be folders like conf, db etc.). You will be storing your code in this repository.

clip_image009

Figure 7 Create repository

 

clip_image011

Figure 8 Resultant repository structure

5. Once you create a new repository, DropBox application will immediately start synchronizing the backup folder with the one on its servers, replicating the entire repository structure.

6. Now navigate to the location where you want to have your working copy of the code. You have to create a working copy in which you will later be checking in your code and any new assets that you want added to your repository.

clip_image013

Figure 9 Check out from your repository

7. Finally you can add folders which can contain your projects. See the final result. (Create new folder –> Add to repository –> Commit to repository)

clip_image015

Figure 10 Example of directory structure of working copy

8. Once you do any updates to your repository from your working copy, only the bits that change or get added will be uploaded to your DropBox profile.

It is important to understand that you have to back up the repository structure itself (i.e. the weird structure containing folders such as conf, db, hooks etc.) and not the working copy, since you can always get the working copy back from the repository.

Many thanks to my brother Priyanshu Agrawal for letting me use his computer for software installation and resultant screenshots.