ThinkingCog

Articles written by Parakh Singhal

Polly - An Introduction

Everything is getting smart. Not everyone, but everything.

My lamp, dishwasher, washing machine, car and everything in between, now connects to the internet and sends notifications about important events, from completing their duties to depleting levels of various consumables required in their functioning.

With all these devices connecting to a distributed and redundant network like the internet, which further hosts distributed systems that spans heterogeneous hardware, multiple software stacks and geographically separated data centers, something, somewhere, is bound to fail, albeit for a fraction of a second, but it will fail. The device may lose its WIFI signal momentarily, the router may experience a hiccup due to a power brownout, the optic network cable may experience the wrath of an excavator bucket or the server may die in the line of duty before the redundant one kicks in. All these are very much possibilities and we experience them all the time.

Such issues that are ephemeral in nature are known as transient errors in the programming domain. They are there momentarily and then they are not, and because of their non-persistent nature, are not easy to debug, as re-producing them may not always be feasible.

One of the proven ways to increase dependability is to increase availability of the desired product or service. Web services and applications run on a 24-hour schedule on servers that continuously consume power. One way to increase their availability is to increase the redundancy i.e. have the same service or application be available on multiple servers. The servers may all be in the same data center or geographically spread. The geographical spread helps in averting a situation where a single data center may experience a power outage, network outage or a natural calamity and take down the entire service or application.

But increasing availability always comes associated with costs. Every server/virtual machine/container cost to run and the more of them exists, the more man hours go in their upkeep and maintenance.

In order to reduce costs, without introducing redundancy beyond warranted, we incorporate smarts into software, so that the application keeps running the desired actions without throwing error, creating an illusion of high availability. Such activity which can introduce resilience in an application, can be achieved with the help of programming frameworks. One, and at the moment only framework to exists for .Net developers happen to be Polly.

With Polly, we can easily incorporate resiliency patterns such as retry pattern, circuit breaker, timeout, bulkhead isolation etc. in our applications and services. Each pattern deserves a post unto itself and in future I will bring more information on the patterns along with code examples.

Till then, please enjoy a video in which Late Mr. Scott Allen discusses about building resilient applications in cloud and look for opportunities to apply a resilient framework like Polly.

Pivoting in SQL Server

Introduction

The meaning of the word “Pivot” in the English language is “The central point, pin or shaft on which a mechanism turns or swivels.”. And that’s what the Pivot function in SQL Server does for you. It is used to convert rows to columns and in the process aggregate data.

Pivot function is used in OLAP queries to power reporting needs. Generally, you will find Pivot queries aggregating data in temporal reports with time-spanning in years or spread over months, or in reports that span a spectrum of an entity like sales regions or sales employees.

Some important notes about Pivot:

  1. Three elements form the foundation of a pivot query:
    1. The aggregating element, accepted by the pivot operator in conjunction with an aggregation function to aggregate the data. This will appear at the intersection of the rows and columns in the pivoted result-set
    2. The grouping element, which appears on rows in the pivoted result-set
    3. The spreading element, which appears as columns in the pivoted result-set
  2. Pivot requires an aggregator function along with a spreading element which would be used to spread data in the form of columns
  3. You have to know every value that would be used as a spreading element and appear as a column header
  4. Pivot uses data from a CTE (Common Table Expression) or a sub-query for processing

Let’s dive into a working example of the Pivot function.

In our example, we will create a simple table storing customer orders. Specifically, it will store customer ids, the value of the transactions they made, and the dates on which the transactions were done.

 

Create Database Temp;
Go
 
Use Temp;
Go
 
Create Table dbo.CustomerOrders
(
    Id int IDENTITY(1,1) NOT NULL,
    CustomerId int NULL,
    OrderId int NULL,
    OrderValue money NULL,
    DateOfOrder date NULL
);
Go
 
Alter Table dbo.CustomerOrders
Add Constraint PK_CustomerOrders_Id Primary Key (Id);
Go
 
Insert Into CustomerOrders 
(CustomerId, OrderId, OrderValue, DateOfOrder) 
Values
(1,210,300,'2021-01-01'),
(1,205,30,'2021-01-05'),
(1,220,200,'2021-02-01'),
(1,230,900,'2021-03-01'),
(1,240,1300,'2021-04-01'),
(1,245,6700,'2021-04-10'),
(1,250,3000,'2021-05-01'),
(1,260,100,'2021-06-01'),
(1,270,700,'2021-07-01'),
(1,275,1700,'2021-07-21'),
(1,280,1200,'2021-08-01'),
(1,290,350,'2021-09-01'),
(1,290,500,'2021-10-01'),
(1,290,600,'2021-11-01'),
(1,290,850,'2021-12-01'),
(1,295,85,'2021-12-25'),
(2,300,850,'2021-01-03'),
(2,310,85,'2021-02-25'),
(2,320,90,'2021-03-15'),
(3,330,850,'2021-02-25'),
(3,335,300,'2021-02-15'),
(3,340,80,'2021-03-29'),
(3,350,450,'2021-04-13'),
(3,360,850,'2021-05-25'),
(3,370,8500,'2021-06-23'),
(3,380,67,'2021-08-16'),
(3,390,550,'2021-12-25'),
(3,395,578,'2021-12-20'),
(4,400,850,'2021-02-25'),
(4,410,85,'2021-03-29'),
(4,420,50,'2021-05-10'),
(4,430,400,'2021-07-30'),
(4,440,900,'2021-08-19'),
(4,440,450,'2021-09-23'),
(4,450,185,'2021-10-30'),
(4,455,250,'2021-11-12'),
(4,460,25,'2021-11-29'),
(4,470,670,'2021-12-29');
Go

 

Pivoting

With the table created and data keyed in, we will now run a pivot query which will help us aggregate the data in a manner that will enable us to see the month-wise spending by each customer in a given year. If you run the CTE code in isolation then you will observe that the query gives you the result with the customer id, order value, month, and year in which the transaction was done in a columnar fashion. With the help of the pivot function, we will spread the data across months and in the process of doing so, aggregate the transactional value. So, customer id and year of transaction form our grouping elements, the month of transaction form our spreading element, and the order-value form or aggregating element.

With RawData As
(
    Select    
    CustomerId,
    OrderValue,
    Year(DateofOrder) as OrderYear,
    Month(DateOfOrder) as OrderMonth
    From CustomerOrders
)
 
Select * From RawData
Pivot(Sum(OrderValue) For OrderMonth 
in ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12])) 
as PivotData
Order By CustomerId Asc, OrderYear Asc;
Go

 

The result of the query when run would be:

Fig. 01 Pivoting Result

Fig. 01 Pivoting result

Now understand a couple of points here:

  1. We can replace the CTE with a subquery. Though, I would recommend a CTE for better clarity in your entire SQL structure.
  2. We can use all the columns returned from the CTE in our select clause of the pivot query, except for the column that needs to be aggregated. That means the "Select" clause can use the grouping and spreading elements, barring the aggregating element.
  3. We have to know the values of the spreading element. The spreading element itself is used in the “For” clause of the pivot query, and its distinct values are used in the “In” clause.
  4. We can use the pivoted columns in the “Select” clause of our pivot query, but only if we know their values and use them as column headers. Generally, to prevent the hassle, developers use the wild-card character to select all the values automatically.

The same query mentioned above can also be run as

With RawData As
(
    Select    
    CustomerId,
    OrderValue,
    Year(DateofOrder) as OrderYear,
    Month(DateOfOrder) as OrderMonth
    From CustomerOrders
)
 
Select 
CustomerId,
OrderYear,
[1] as January,
[2] as February,
[3] as March,
[4] as April,
[5] as May,
[6] as June,
[7] as July,
[8] as August,
[9] as September,
[10] as October,
[11] as November,
[12] as December
From RawData
Pivot(Sum(OrderValue) For OrderMonth 
in ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12])) as PivotData
Order By CustomerId, OrderYear;
Go

 

Now that we have mentioned the spreading elements with our choice of column headers, the output of the query would look like:

Fig. 02 Pivoting Result with Specific Spreading Elements

Fig. 02 Pivoting Result with Specific Spreading Elements

Here’s the general format from Microsoft that you can use to run a pivot query on your data.

SELECT <non-pivoted column>,  
    [first pivoted column] AS <column name>,  
    [second pivoted column] AS <column name>,  
    ...  
    [last pivoted column] AS <column name>  
FROM  
    (<SELECT query that produces the data>)   
    AS <alias for the source query>  
PIVOT  
(  
    <aggregation function>(<column being aggregated>)  
FOR   
[<column that contains the values that will become column headers>]   
    IN ( [first pivoted column], [second pivoted column],  
    ... [last pivoted column])  
) AS <alias for the pivot table>  
<optional ORDER BY clause>;  

 

Doing Pivoting with Dynamic SQL

The biggest challenge that you will come across when using the pivot function in SQL Server will be that many times you may come across new entrants in the spreading elements that will force you to upgrade the SQL query. An example of that is, say, a new geographical region came up that needs to be accounted for in the spreading element, or the spreading element is required to include the last 5 years and it will require an update every year as time goes by. To deal with this issue, we have to take advantage of dynamic SQL.

We will leverage a function available in SQL Server to aid in such scenarios. Note that the values of the spreading element are required to be delimited in square brackets, so if there’s any white space between the characters constituting the values of the spreading elements, it can be taken into account.

The way such a query will work would be:

1. Extract all the distinct values to be included in the spreading element and concatenate them into a SQL parameter.

2. While concatenating, use the Quotename function to encapsulate all the distinct spreading element values into square brackets used as delimiters, separated by commas.

3. Eliminate the last trailing comma from the SQL parameter with all the values for the spreading element.

4. Capture the SQL to be run into a parameter with a spreading element captured appropriately.

Example:

-- Declare the SQL parameters to capture
-- values for the spreading element and sql query
Declare 
@SpreadingElement nvarchar(max) = '',
@SQL nvarchar(max) = '';
 
-- Capture distinct spreading element values
With MonthData As
(Select distinct Month(DateofOrder) As Months From CustomerOrders)
 
Select @SpreadingElement += QUOTENAME(Months,'[]') + ',' From MonthData
 
-- Eliminating the trailing comma from the spreading element values
Set @SpreadingElement = LEFT(@SpreadingElement, Len(@SpreadingElement) - 1);
 
-- Dynamic SQL
Set @SQL = 
'With RawData As
(
    Select    
    CustomerId,
    OrderValue,
    Year(DateofOrder) as OrderYear,
    Month(DateOfOrder) as OrderMonth
    From CustomerOrders
)
 
Select 
* From RawData
Pivot(Sum(OrderValue) For OrderMonth 
in (' + @SpreadingElement + ')) as PivotData
Order By CustomerId Asc, OrderYear Asc;'
 
Execute sp_executesql @SQL;

 

The query gives the same result as given by the previous queries.

Fig. 03 Pivoting Result Dynamic SQLFig. 03 Pivoting Result Dynamic SQL

Hope this was useful.

References:

1. T-SQL Window Functions: For data analysis and beyond (Developer Reference) by Itzik Ben-Gan

2. Common Table Expressions in SQL Server

3. Window Functions in SQL Server

4. Pivot, Multi Pivot and Dynamic Pivot in SQL Server – SQLChitChat.com

How to Generate a Sequence of Date and Time Values in SQL

There are a set of problems that require the usage of a series of sequential date and time values. One such problem is to generate reports comprising of sales or events that would capture a sequence of continuous date and time values irrespective of whether a sale or an occurrence of an event happened on a certain date or a time value. Sequential date and time value are also required in applications like schedulers to execute the desired program at a pre-defined date or time value.

There are various ways to generate a sequence of date and time values, but of all that I have come across, the one that leverages the Row Number window function is the most performant and leverages the knowledge of generating sequential numbers that I have already covered in my previous post.

The over-arching concept upon which the generation of date and time values will depend will be the generation of sequential numbers. The pseudo-code will be as follows:
1. Generate sequential numbers between the lower and the upper bound as given. The two bound figures will be the date or time values.
2. Use the DateAdd function to add the generated sequence in step no 1 to lower bound in the unit of time as desired. The unit of time can be minute, hour, day, etc.

First, create a function that can get us a range of sequential numbers between the lower and upper bound:

-- Function to return a table of numbers between lowerbound and upperbound
Create Function dbo.GetNumbers(@LowerBound As Bigint, @UpperBound As Bigint) 
Returns Table
As 
Return
 
-- By default, the CTE will always generate numbers starting with 1 
-- going all the way upto 42,99,67,296.
-- SQL Engine is smart and will only generate numbers up until 
-- the upperbound when defined
With 
-- 2^1 = 2 values
Level0 As (Select 1 as SeedValue1 Union All Select 1 As SeedValue2), 
-- 2^2 = 4 values
Level1 As (Select 1 As GeneratedValues From Level0 As A 
    Cross Join Level0 As B), 
-- 2^4 = 16 values
Level2 As (Select 1 As GeneratedValues From Level1 As A 
    Cross Join Level1 As B), 
-- 2^8 = 256 values
Level3 As (Select 1 As GeneratedValues From Level2 As A 
    Cross Join Level2 As B), 
-- 2^16 = 65,536 values
Level4 As (Select 1 As GeneratedValues From Level3 As A 
    Cross Join Level3 As B), 
-- 2^32 = 42,99,67,296 values
Level5 As (Select 1 As GeneratedValues From Level4 As A 
    Cross Join Level4 As B), 
-- Obtain unique integers
Numbers As (Select Row_Number() Over(Order By GeneratedValues) As 
GeneratedValues From Level5) 

 

Then depending upon the unit of time desired (day, hour, minute etc.), use the following query to generate a sequence of date and time values:

/*
Example: To generate consecutive dates between two given dates, 
  we will first generate the total gap in terms of the 
  number of days (sequence of numbers) between the given 
  bounds and then use that sequence and add the days to 
  the lower bound to reach the upper bound in a continuous manner.
*/
Declare 
@Start as Date = '20210101',
@End as Date = '20210131'
 
Select Dateadd(day, GeneratedNumbers, @Start) as Dates
From dbo.GetNumbers(0, DateDiff(day, @Start, @End)) 

 


01 Generation of Sequential Dates

Figure 1: Generation of sequential dates between a lower and upper bounds

Here’s a query that generates hour values:

/*
Example: To generate consecutive 12-hour units between the provided bounds, 
  we will first generate a sequence consisting of the total number of such 
  units between the provided bounds. Then add units progressively 
  to the lower bound to reach the upper bound in a continuous manner.
*/
Declare 
@Start as Datetime2 = '2021-01-01 00:00:00',
@End as Datetime2 = '2021-01-31 00:00:00'
 
Select Dateadd(hour, GeneratedNumbers*12, @start) As TimeSlots
from dbo.GetNumbers(0, datediff(hour, @Start, @End)/12) 

02 Generation of Sequential 12 Hour Units

Figure 2 Generation of sequential units of 12 hours each between a lower and upper bounds

Hope this article was useful.

References:

1. T-SQL Window Functions: For data analysis and beyond (Developer Reference) by Itzik Ben-Gan
2. Common Table Expressions in SQL Server
3. Window Functions in SQL Server
4. Row_Number Window Function in SQL Server

How to Create a Virtual Numbers Table in SQL

A number of problems exist out in the world of SQL, which requires a range of sequential numbers available on demand. This can be achieved either by generating the numbers on demand virtually or by storing them in a table and using them as and when required. In this blog post, I will shed light on how to go about generating such a sequence with the help of a window function and then refine it to generate a range of sequential numbers within the specified bounds.

Short overview of window functions

Window functions are part of the ANSI SQL standard and are fundamentally analytical functions that operate on a section of data, called partition or window. Partition or window can be a subset of the entire result set or the entire result set itself as defined by the SQL query.

Row_Number is a window function that falls in the category of ranking functions and generates a sequence number for every row within the window as defined in the SQL query. An example is going to make it clear:

Create Table dbo.CustomerOrders
(
   Id                   int  Identity(1,1)  Primary Key,
   CustomerId           int,
   OrderId              int,
   OrderValue           money,
   DateOfOrder          Date
);
Go
 
Insert Into dbo.CustomerOrders 
(CustomerId, OrderId, OrderValue, DateOfOrder) 
Values
(1, 200, 300, GETDATE()-1),
(1, 201, 450, GETDATE()+1),
(2, 202, 500, GETDATE()-2),
(3, 203, 600, GETDATE()+3),
(4, 204, 800, GETDATE()+3),
(2, 205, 650, GETDATE()-3),
(1, 206, 600, GETDATE()+3);
Go
 
Select Id, CustomerId, OrderId, OrderValue, DateOfOrder,
Row_Number() Over(Partition By CustomerId Order By OrderValue) 
as RowNumber
From dbo.CustomerOrders


When you run the example mentioned in the SQL query it will produce a result that looks like:

Figure 1 Functioning of Row Number function

 

 

 

 

 

 

 

 

 

 

 

 

 

Figure 1 Functioning of Row Number function

As you can see in the example, the Row_Number function worked on the window or partition created based on the customer id and has numbered the rows sequentially. There are three orders placed by the customer with the customer id of 1, two orders placed by the customer with customer id of 2, and one order each placed by customers with customer id of 3 and 4 correspondingly. The rows have been ordered for each customer based on the value of the order in an ascending manner.

I will not delve into the functioning of the window functions; different clauses associated with them or explain the Row_Number function here. They deserve an entire blog post unto themselves. For more information on window functions and Row_Number function, please refer to the information provided in the references section at the end of this article.

Creating a Virtual Number Table in SQL

There are problems like generating a contiguous range of dates, identifying gaps (Islands problem), splitting strings that use a contiguous set of numbers, and in such problems, we would require a table with a sequence of numbers. The approach discussed in this article hinges on the understanding of the Common Table Expression (CTE) and the Row_Number window function.

The overarching approach of this technique is to generate a sequence of 1s with the help of a series of CTEs and then use the Row_Number window function that would ride the sequence of 1s to generate a sequence of ascending integer numbers.

In order to generate the seed values, consider the following CTE:

Level0 As (Select 1 as SeedValue1 Union All Select 1 As SeedValue2)

When you run the CTE, it generates the following output:

Figure 2 Seed values as emitted by the seed CTE

 

 

 

 

 

 

 

 

 

 

 

Figure 2 Seed values as emitted by the seed CTE

We then proceed to create a series of CTEs which by virtue of cross join between consecutive CTEs create an exponential sequence of 1s.

With 
 
-- 2^1 = 2 values
Level0 As (Select 1 as SeedValue1 Union All Select 1 As SeedValue2), 
 
-- 2^2 = 4 values

Level1 As (Select 1 As GeneratedValues From Level0 As A

Cross Join Level0 As B),

 
-- 2^4 = 16 values
Level2 As (Select 1 As GeneratedValues From Level1 As A 
Cross Join Level1 As B), 
 
-- 2^8 = 256 values

Level3 As (Select 1 As GeneratedValues From Level2 As A

Cross Join Level2 As B),

 
-- 2^16 = 65,536 values
Level4 As (Select 1 As GeneratedValues From Level3 As A 
Cross Join Level3 As B), 

 

We then generate unique sequential numbers with the help of Row_Number window function working over the complete result set as its window. The full SQL query that would be able to generate a total of 42,99,67,296 values, would look something like:

With 
 
-- 2^1 = 2 values
Level0 As (Select 1 as SeedValue1 Union All Select 1 As SeedValue2), 
 
-- 2^2 = 4 values

Level1 As (Select 1 As GeneratedValues From Level0 As A

Cross Join Level0 As B),

 
-- 2^4 = 16 values
Level2 As (Select 1 As GeneratedValues From Level1 As A 
Cross Join Level1 As B), 
 
-- 2^8 = 256 values

Level3 As (Select 1 As GeneratedValues From Level2 As A

Cross Join Level2 As B),

 
-- 2^16 = 65,536 values
Level4 As (Select 1 As GeneratedValues From Level3 As A 
Cross Join Level3 As B), 
 
-- 2^32 = 42,99,67,296 values

Level5 As (Select 1 As GeneratedValues From Level4 As A C

ross Join Level4 As B),

 
-- Obtain unique integers
Numbers As (Select Row_Number() Over(Order By GeneratedValues) 
As GeneratedValues From Level5) 

 

As the first refinement, we can generate only the required number of sequential numbers as required by introducing the concept of paging implemented with the help of the Offset-Fetch clause. That way the total number of sequential numbers that will get generated will get dictated by the upper bound mentioned by the user. The SQL query will then filter the requisite number of sequential numbers and provide the output.

Declare 
@LowerBound Bigint = 1,
@UpperBound Bigint = 429967296; 
 
-- By default, the CTE will always generate numbers starting with 1 going all the way upto 42,99,67,296
-- SQL Engine is smart and will only generate numbers up until the upperbound when defined
With 
 
-- 2^1 = 2 values
Level0 As (Select 1 as SeedValue1 Union All Select 1 As SeedValue2), 
 
-- 2^2 = 4 values

Level1 As (Select 1 As GeneratedValues From Level0 As A

Cross Join Level0 As B),

 
-- 2^4 = 16 values
Level2 As (Select 1 As GeneratedValues From Level1 As A 
Cross Join Level1 As B), 
 
-- 2^8 = 256 values

Level3 As (Select 1 As GeneratedValues From Level2 As A

Cross Join Level2 As B),

 
-- 2^16 = 65,536 values
Level4 As (Select 1 As GeneratedValues From Level3 As A 
Cross Join Level3 As B), 
 
-- 2^32 = 42,99,67,296 values

Level5 As (Select 1 As GeneratedValues From Level4 As A

Cross Join Level4 As B),

 
-- Obtain unique integers
Numbers As (Select Row_Number() Over(Order By GeneratedValues) 
As GeneratedValues From Level5) 
 
-- Return the numbers between the lower and the upper bound
Select @LowerBound + GeneratedValues – 1 
As GeneratedNumbers From Numbers
Order By GeneratedValues
Offset 0 Rows 
Fetch First @UpperBound - @LowerBound + 1 Rows Only;
Go

An example of execution of this query:

02 Execution Result of Complete QueryFigure 3 Result of execution of query between the lower bound of 1 and upper bound of 15

As the second refinement, we can create a table-valued function and execute it where ever and whenever we require a sequence of numbers

-- Function to return a table of numbers between lowerbound and upperbound
Create Function dbo.GetNumbers(@LowerBound As Bigint, @UpperBound As Bigint) 
Returns Table
As 
Return
 
-- By default, the CTE will always generate numbers starting with 1 going all the way upto 42,99,67,296
-- SQL Engine is smart and will only generate numbers up until the upperbound when defined
With 
 
-- 2^1 = 2 values
Level0 As (Select 1 as SeedValue1 Union All Select 1 As SeedValue2), 
 
-- 2^2 = 4 values

Level1 As (Select 1 As GeneratedValues From Level0 As A C

ross Join Level0 As B),

 
-- 2^4 = 16 values
Level2 As (Select 1 As GeneratedValues From Level1 As A 
Cross Join Level1 As B), 
 
-- 2^8 = 256 values

Level3 As (Select 1 As GeneratedValues From Level2 As A

Cross Join Level2 As B),

 
-- 2^16 = 65,536 values
Level4 As (Select 1 As GeneratedValues From Level3 As A 
Cross Join Level3 As B), 
 
-- 2^32 = 42,99,67,296 values

Level5 As (Select 1 As GeneratedValues From Level4 As A

Cross Join Level4 As B),

 
-- Obtain unique integers
Numbers As (Select Row_Number() Over(Order By GeneratedValues) 
As GeneratedValues From Level5) 
 
-- Return the numbers between the lower and the upper bound
Select 
@LowerBound + GeneratedValues - 1 
As GeneratedNumbers From Numbers
Order By GeneratedValues
Offset 0 Rows 
Fetch First @UpperBound - @LowerBound + 1 Rows Only;
Go

 

An example of the execution of the function:

Figure 4 Result of execution of the table valued function between the lower bound of 10 and upper bound of 30

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Figure 4 Result of execution of the table valued function between the lower bound of 10 and upper bound of 30

Hope this was helpful.

References:

1. T-SQL Window Functions: For data analysis and beyond (Developer Reference) by Itzik Ben-Gan
2. Common Table Expressions in SQL Server
3. Window Functions in SQL Server
4. Row Number Window Function in SQL Server

Normalization In Relational Databases

Designing a performant data model to accommodate data for an application can be a bit of a challenging task for a new developer. The difficulty arises from the lack of understanding of the nature of the data itself and using the tools relational database design to divide data logically into tables. In this short article, I have tried to shed light on normalization in relational databases.

Normalization in relational databases is the process of minimizing the duplication of data. This is done by spreading data into different tables, all logically connected via relationships. Normalization is a play on divide and rule strategy in the computer science domain.

Before we move further into the article, some of the terminologies bear a small introduction here.

Atomicity: In the context of relational databases, atomicity refers to the condition of data when it cannot be broken any further without loss of its comprehensiveness.


Attribute: Attribute describes characteristics of data. That can be the max length of the first name or the fact that the date of birth of a person in an application meant to store passport details can only go back in the past, say, 125 years.

Field: A field represents a data point which is a fundamental unit that makes up a record. A field stores an atomic value in an optimized form.

Record: A record represents a complete unit of data that is stored in a table. A record is made up of fields. Column: A column is an implementation in relational databases to store a field.

Row: A row is an implementation in relational databases to store a record.

Table: A table is a data structure in a relational database that is made up of rows and columns and facilitates the storage and querying of data.

Key: In the English language the word key as an adjective means something of crucial importance. There are different kinds of keys in a relational database and they play a crucial role in the functioning of a database.

Primary Key: A primary key is a construct in a table in a relational database that can be used to uniquely identify a record. We will discuss various kinds of primary keys at an opportune point in the article.

Now let’s take an example and thoroughly understand normalization.

Example: Consider that we have to store information about books in our database. The information comprises of the following:

  1. Book’s title
  2. Book’s author(s)
  3. ISBN
  4. Publisher’s name
  5. Publisher’s contact
  6. Date of publication
  7. Cost
  8. Category or categories in which the book can be filed
  9. The book should be individually identifiable based on the ISBN
  10. The book should be searchable on the basis of one of the categories that it can be filed in or the publisher’s name

Now, the way a new developer would think would be on the lines of accommodating all this information in just one table. Let’s understand the problems that one would encounter in such a scenario.

ISBN

Title

Authors

Publisher

PublisherContact

DateOfPublication

Price

Category

9781408855652

Harry Potter and the Philosopher's Stone

J.K. Rowling

Pottermore Publishing

Harry Potter

8-12-2015

200

Fantasy, Fiction

9780007547999

Algorithms to Live By: The Computer Science of Human Decisions

Brian Christian, Tom Griffiths

William Collins

John Doe

26-04-2017

250

Computer Science

938943243X

Rangbhoomi

Munshi Premchand

Fingerprint Publishing

Parakh Singhal

01-09-2019

672

Hindi Literature


Per the requirements, we can have multiple authors for a book and, a book can be categorized in multiple categories. Thus, we have the potential of storing multiple values for each of the fields of “Authors” and “Categories.”

Problems with this design: Anytime the number of authors changes for a book (by addition or update of a chapter(s) written by some new author, or by an existing author selling his stake in the works) or we want to increase or decrease the categories in which a book can be filed, we will have to read the existing data, modify it and then write it back into the existing row. This is tedious and time-consuming from a programming standpoint. A simple row deletion or addition independent of the existing row(s) would be far simpler and faster.

In order to eliminate the set of problems that arise under the category of organization of data in relational databases, we have the tool of normalization.

Normalization is a process under which we minimize data duplication and straighten relationships among the attributes of data. The various stages in this process are called "normal forms." There are 4 normal forms with a focus on organizing data progressively. In this article, we will focus on the 1st Normal Form, 2nd Normal Form, and the 3rd Normal Form.

1st Normal Form

The academic definition of the first normal form is that a table is in the first normal form if a relation has all the attributes in the atomic state.

In simple words, the definition merely states not to store multiple values in a field. In our example above, both the fields of “Categories” and “Authors” violate the first normal form, as we are storing multiple values in those fields. When we store only a single coherent value in a field that cannot be broken any further, and hence atomic, we achieve the 1st Normal Form. Therefore, we need to store authors and categories separately in our table, which would then look something like this:

ISBN

Title

Authors

Publisher

PublisherContact

DateOfPublication

Price

Category

9781408855652

Harry Potter and the Philosopher's Stone

J.K. Rowling

Pottermore Publishing

Harry Potter

8-12-2015

200

Fantasy

9781408855652

Harry Potter and the Philosopher's Stone

J.K. Rowling

Pottermore Publishing

Harry Potter

8-12-2015

200

Fiction

9780007547999

Algorithms to Live By: The Computer Science of Human Decisions

Brian Christian

William Collins

John Doe

26-04-2017

250

Computer Science

9780007547999

Algorithms to Live By: The Computer Science of Human Decisions

Tom Griffiths

William Collins

John Doe

26-04-2017

250

Computer Science

938943243X

Rangbhoomi

Munshi Premchand

Fingerprint Publishing

Parakh Singhal

01-09-2019

672

Hindi Literature

Now we have atomic values in all our fields and the table complies with the first normal form.

But the compliance with the first normal form has given rise to another problem. Now we cannot use the ISBN field as the primary key as it exists in multiple rows.

This is a good place to introduce the three types of primary keys that may exist in your data. They are:

  1. Natural Primary Key
  2. Composite Primary Key
  3. Surrogate Primary Key

Natural Primary Key: A primary key is called a natural primary key if the data that needs to be accommodated in a table has a field that is unique and can be used to identify a record uniquely among other records. In our example, before we implemented the first normal form, the ISBN can be organically used to uniquely identify a record, thus forming a natural primary key.

Composite Primary Key: A natural primary key that is made up of a bunch of fields in the data that needs to be accommodated in a table, is called a composite primary key. Composite primary key comes into the picture when a single field cannot be used to uniquely identify a record from among other records, but a combination of certain fields is unique and can serve the purpose. In our example, after the implementation of the first normal form, a combination of ISBN and author name can be used to uniquely identify a record from among other records, as an author’s name is only going to appear once for an ISBN. Thus, the combination of author name and ISBN can form a composite primary key.

Surrogate Primary Key: A key that is not related to data naturally in any manner whatsoever is called a surrogate primary key. Generally, it is a key that gets generated automatically by the database engine. The developer is just required to select the data type of the surrogate primary key. In our example, we can implement an int type or a GUID type surrogate primary key instead of relying on a composite primary key as described before.

Because we cannot use just the field of ISBN as the primary key, we will have to go for either a composite primary key or a surrogate primary key. In our case, we can go for a composite primary key and remain within the confines of our natural data, without introducing a surrogate primary key.

2nd Normal Form

In the second normal form, we work with a table that is already in the first normal form. Then we remove the partial dependency from the table if any.

Partial Dependency: Partial dependency is a condition whereby the fields contained within a record depend only upon one portion of the primary key.

In our example, when we achieved the first normal form, we introduced a composite primary key which was made up of the ISBN and the author’s name fields. Hence, the data in the table depended on two fields that made up the primary key. To push our example to comply with the second normal form, we will have to bifurcate the data into two distinct tables – each portion of the composite primary key representing a separate table. In our example we will create two tables, one housing data about authors and the other housing data about the books.

Our tables would look something like this:

Books table with the field “ISBN” as the natural primary key:

ISBN

Title

Publisher

PublisherContact

DateOfPublication

Price

Category

9781408855652

Harry Potter and the Philosopher's Stone

Pottermore Publishing

Harry Potter

8-12-2015

200

Fantasy

9781408855652

Harry Potter and the Philosopher's Stone

Pottermore Publishing

Harry Potter

8-12-2015

200

Fiction

9780007547999

Algorithms to Live By: The Computer Science of Human Decisions

William Collins

John Doe

26-04-2017

250

Computer Science

9780007547999

Algorithms to Live By: The Computer Science of Human Decisions

William Collins

John Doe

26-04-2017

250

Computer Science

938943243X

Rangbhoomi

Fingerprint Publishing

Parakh Singhal

01-09-2019

672

Hindi Literature


Authors table with the field “AuthorId” as the surrogate primary key:

AuthorId

Authors

1

J.K. Rowling

2

Brian Christian

3

Tom Griffiths

4

Munshi Premchand


Now both the tables do not have any partial dependency and have graduated to the second normal form.

3rd Normal Form

In the third normal form, we work with a table that is in the second normal form and aim to remove any transitive dependency, if any.

Transitive Dependency: Transitive dependency is a condition whereby a field is not dependent on the primary key in any way.

In our example, the publisher’s contact is dependent on the publisher and does not in any way connect to the book and hence ISBN, which happens to the natural primary key in our Books table. We will push the publisher’s contact into a separate table to achieve the third normal form. Our tables would then look something like this:

Books table with ISBN as the natural primary key and the transitive dependency of publisher’s contact field removed:

ISBN

Title

Publisher

DateOfPublication

Price

Category

9781408855652

Harry Potter and the Philosopher's Stone

Pottermore Publishing

8-12-2015

200

Fantasy

9781408855652

Harry Potter and the Philosopher's Stone

Pottermore Publishing

8-12-2015

200

Fiction

9780007547999

Algorithms to Live By: The Computer Science of Human Decisions

William Collins

26-04-2017

250

Computer Science

9780007547999

Algorithms to Live By: The Computer Science of Human Decisions

William Collins

26-04-2017

250

Computer Science

938943243X

Rangbhoomi

Fingerprint Publishing

01-09-2019

672

Hindi Literature


Authors table with the field “AuthorId” as the surrogate primary key:

AuthorId

Authors

1

J.K. Rowling

2

Brian Christian

3

Tom Griffiths

4

Munshi Premchand


Publisher Contact table with the field “Publisher” as the natural primary key:

Publisher

PublisherContact

Pottermore Publishing

Harry Potter

William Collins

John Doe

Fingerprint Publishing

Parakh Singhal


We can further improve upon the final design by adopting surrogate primary keys in all our tables. That way if there’s any update of data in fields acting as natural primary keys, we will not have to update the same in the tables utilizing them as foreign keys. Also, we reduced the duplication of data in the Books table by creating a bridge entity to accommodate the many-to-many relationship between a book and the categories into which it can be classified.

Data model with surrogate primary keys:

Normalized Data Model with Bridge Tables to handle Many-To-Many RelationshipsFigure 1 Normalized Data Model with Bridge Tables to handle Many-To-Many Relationships


In a nutshell:

1st Normal Form: Aim for atomicity of data by elimination of multiple values in a single field and identification of a field that can act as a primary key.

2nd Normal Form: Make sure the table already exists in the first normal form and eliminate any partial dependencies.

3rd Normal Form: Make sure the table already exists in the second normal form and eliminate any transitive dependencies.

I hope this article helped you in learning the concept of normalization.

Events and Delegates – Part 3 of 3

In part 2 of this series, I introduced delegates as the implementation of the callback mechanism in .net. We saw how a delegate act as a pipeline connecting code that generates information to the code that uses that information.

In this post, I will introduce events and how they fit neatly into the entire picture and complement the abstraction that we started with delegates.

Non-Technical Overview of Events

Events in real life are important occurrences which, when happen, result in something done. Similarly, events in object-oriented programming languages represent important occurrences which when exercised need something done.

Technical Overview

In the previous post we saw how we can use a delegate as a pipeline to broadcast the information from an object to the subscribers.

Events take this concept to the next level and defer the work of creating the underlying delegate to the compiler. All a developer has to worry about is creating the right kind of event that fits in the domain model.

Events can be created with the help of the “event” keyword. We have to use a user defined delegate or a .net framework provided delegate to let the compiler know what type of underlying delegate needs to be formed behind the scenes. To keep the article short and focus on understanding the concept, I will only cover the EventHandler delegate provided by the .net framework in this post.

The EventHandler delegate is provided as a ready to use delegate. Its signature comprises passing an object type and an object of the EventArgs class. The first parameter is an object of the class that exposes the event itself, and the EventArgs object represents additional information related to the emitted event that might not be contained in the object of the class itself. The delegate returns a void.

public void EventHandler (object? sender, EventArgs e);
 

To pass any event information that might not be contained in the object, we can create a user-defined class derived from EventArgs class, and provide the peripheral information and functionality related to the emitted event via properties and methods.

The concepts covered in the technical overview will become a bit clearer in the example code covered in use cases.

Use Cases of Events

Events are calls that are raised when programmed conditions are met. Some examples are:

1. To broadcast changes in an object’s state
2. To broadcast fulfilment of a condition

We saw in the previous post how we have to create additional code to register and un-register methods that want to use the delegates. With the event keyword we can bypass all that boilerplate code and directly work with events and event handlers.

Broadcast changes in an object’s state

I am going to continue an example discussed in my previous post. Human beings are sensitive to their ambient temperature. In the code below, we will create a class “Human” and broadcast the response of its object to different temperatures. In my previous post, this was implemented with the help of delegates, but in this post, I will use events to implement the same.

public class Human
    {
        // Basic properties
        public int Id { get; set; }
        public string Name { get; set; }
        public float MinTemp { get; set; }
        public float MaxTemp { get; set; }
        private string feeling;
        public string Feeling
        {
            get
            {
                return feeling;
            }
            set
            {
                feeling = value;
 
                // If there are subscribers to the OnTemperatureChanged event
                // change in Feeling property will trigger off the event
                if (OnTemperatureChanged != null)
                {
                    OnTemperatureChanged(this, new EventArgs());
                }
            }
        }
 
        // Events
        public event EventHandler OnTemperatureChanged;
 
        public Human()
        {
 
        }
 
        // User defined constructor
        public Human(int id, string name, float minTemp, float maxTemp, string feeling)
        {
            Id = id;
            Name = name;
            MinTemp = minTemp;
            MaxTemp = maxTemp;
            Feeling = feeling;
        }
 
        public void TemperatureSensation(float temperature)
        {
 
            if (temperature > MaxTemp)
            {
                // If the OnTemperatureChanged event's invocation list is not empty
                // change in Feeling property's value will trigger the event to fire off
                Feeling = "I am feeling hot";
            }
            else if (temperature < MinTemp)
            {
                Feeling = "I am feeling cold";
            }
            else
            {
                Feeling = "I am feeling normal";
            }
        }
    }

In the code above, we have a class "Human" with five properties, of which three are of importance - MinTemp, MaxTemp and Feeling. The method TemperatureSensation accepts a float type parameter and depending on where the value stands in the spectrum between MinTemp and MaxTemp, sets the value of the Feeling property. We have an explicit implementation of the Feeling property and if there are subscribers to the OnTemperatureChanged event, it fires off when the value of the Feeling property’s underlying variable is set to a new value. In the following code, we create an object of the Human class and test the functionality of event declared in the Human class.

class Program
    {
        static void Main(string[] args)
        {
            Human human = new Human() 
            {
                Id = 1,
                Name = "Parakh Singhal",
                MaxTemp = 45f,
                MinTemp = 10f, 
                Feeling = null
            };
 
            human.OnTemperatureChanged += OnTemperatureChangeHandler;
 
            human.TemperatureSensation(8f);
            human.TemperatureSensation(25f);
            human.TemperatureSensation(50f);
 
            Console.WriteLine("Press any key to terminate the program...");
            Console.ReadKey();
        }
 
        public static void OnTemperatureChangeHandler(object sender, EventArgs e)
        {
            if (sender is Human)
            {
                Human human = sender as Human;
                Console.WriteLine(human.Feeling);
            }
        }
    }

 

In the code above we created an object of the Human class and instantiated it with some data. Then, we enrolled in a method “OnTemperatureChangeHandler” that matched the “OnTemperatureChanged“ event’s underlying delegate’s signature and return type. Then we invoked the TemperatureSensation method on the human object which then broadcasted the messages to the event’s subscribed members.

The code when executed provides the following output:

3 of n 01 Object State

Contrast this code with the one created in the previous post, where not only we operated on the back of a delegate object, but also created the registration and un-registration methods for the methods to get enrolled in the delegate’s invocation list, to prevent direct access to delegate object’s invocation list. We also had more lines of programming as we had to trigger off the delegate wherever the state of the object was deemed to change. The new code is not only concise but also more maintainable.

Here we leverage the code that gets created by C# compiler for us. The event keyword, behind the scenes gets expanded into a delegate accepting the standard input parameters of an object type and an EventArgs class object and returning a void. We can then enrol methods with the same signature and return type, using the overridden “+” operator. Methods can be un-registered using the “-“ operator.

Broadcast of fulfilment of a certain condition

Some conditions when they get fulfilled, warrant an intimation. The information about the fulfilment of condition is broadcasted via events and the subscribers then process the information as they deem fit.

Let’s take an example of a library where the information about successful checkout of a book is processed.

 

public class Author
    {
        public int AuthorId { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
    }
public class Book
    {
        public int BookId { get; set; }
        public string ISBN { get; set; }
        public string Title { get; set; }
        public Author Author { get; set; }
        public int PageCount { get; set; }
        public bool IsCheckedOut { get; set; }
    }
public class Member
    {
        public int MemberId { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public int NumberOfBooksCheckedOut { get; set; }
        public List<Book> BooksCheckedOut { get; set; }
    }
public Library()
        {
            Books = new List<Book>()
                        {
                            new Book()
                            {
                                BookId = 1,
                                Title = "Alice in Wonderland",
                                IsCheckedOut = false,
                                PageCount = 200,
                                Author = new Author()
                                {
                                    AuthorId = 1,
                                    FirstName = "Lewis",
                                    LastName = "Carroll"
                                }
                            },
                            new Book()
                            {
                                BookId = 2,
                                Title = "Bad Blood",
                                IsCheckedOut = false,
                                PageCount = 350,
                                Author = new Author()
                                {
                                    AuthorId = 2,
                                    FirstName = "John",
                                    LastName = "Carreyrou"
                                }
                            },
                            new Book()
                            {
                                BookId =  3,
                                Title = "The Dream Machine",
                                IsCheckedOut = false,
                                PageCount = 250,
                                Author = new Author()
                                {
                                    AuthorId = 3,
                                    FirstName="Mitchell",
                                    LastName = "Waldrop"
                                }
                            },
                            new Book()
                            {
                                BookId = 4,
                                Title = "The Structure of Scientific Revolution",
                                IsCheckedOut = false,
                                PageCount = 500,
                                Author = new Author()
                                {
                                    AuthorId = 4,
                                    FirstName = "Thomas",
                                    LastName=  "Kuhn"
                                }
                            },
                            new Book()
                            {
                                BookId =5,
                                Title = "Sapiens: A Brief History of Humankind",
                                IsCheckedOut = false,
                                PageCount = 450,
                                Author =new Author()
                                {
                                    AuthorId = 5,
                                    FirstName = "Yuval",
                                    LastName = "Hariri"
                                }
                            }
                        };
            Members = new List<Member>()
                        {
                            new Member()
                            {
                                MemberId = 1,
                                FirstName = "Parakh",
                                LastName = "Singhal",
                                NumberOfBooksCheckedOut = 0
                            },
                            new Member()
                            {
                                MemberId = 2,
                                FirstName = "Prateek",
                                LastName = "Mathur",
                                NumberOfBooksCheckedOut = 0
                            },
                            new Member()
                            {
                                MemberId =3,
                                FirstName = "Sumant",
                                LastName = "Sharma",
                                NumberOfBooksCheckedOut = 0
                            }
                        };
        }
 
        public void CheckOutBook(int memberId, int bookId)
        {
            Book book = Books.Find(book => book.BookId == bookId);
            Member member = Members.Find(member => member.MemberId == memberId);
 
            if (book.IsCheckedOut)
            {

Console.WriteLine("Apologies, but book is already checked out.

Please select another book.");

            }
            else if (member.NumberOfBooksCheckedOut >= 2)
            {

Console.WriteLine("Apologies, but there are already 2 books

checked out to you.");

            }
            else
            {
                book.IsCheckedOut = true;
                member.NumberOfBooksCheckedOut += 1;

OnSuccessfulCheckOut?.Invoke(this,

new LibraryEventArgs() { Message = "Checkout of book is successful" });

 
            }
        }
    }
 
public sealed class LibraryEventArgs : EventArgs
    {
        public string Message { get; set; }
    }

 

The model created above is consumed in the console application.

class Program
    {
        static void Main(string[] args)
        {
            Library library = new Library();
 
            library.OnSuccessfulCheckOut += CheckoutSuccessfulEventHandler;
 
 
            library.CheckOutBook(1, 1);
 
            Console.WriteLine("Press any key to exit the program...");
            Console.ReadKey();
        }
 
        private static void CheckoutSuccessfulEventHandler(object sender, EventArgs e)
        {
            if (sender is Library && e is LibraryEventArgs)
            {
                LibraryEventArgs libraryEventArgs = e as LibraryEventArgs;
                Console.WriteLine(libraryEventArgs.Message);
            }
        }    
}

 

When executed the following output appears:

3 of n 02 Object State

The example shows usage of event for emitting information for an important activity with the help of event. In the example when the book is successfully checked out, the event OnSuccesfulCheckOut is fired, provided that the underlying delegate’s invocation list is not empty, and the information regarding the successful checkout of the book is emitted to the subscribed event handlers.

Summary

Delegates help separation with the process of emitting the information and processing that information, events take things to the next level by raising the level of abstraction and using a more natural vocabulary.

I hope that this post helped you in rounding out the concept of events and delegates and how events are built on the foundation of delegates, which is the reason, delegates are discussed first in academic texts.

References

  1. Pro C# 8 with .NET Core 3 by Andrew Troelsen and Phil Japikse
  2. CLR via C# by Jeffrey Richter

Delegates And Events – Part 2 of 3

In my last post, I explained the concept of events, delegates, and event handlers at a high level. In this post, let’s take a look at delegates. Why delegates and not events, because among them, they are the tricky ones and can be understood in relative isolation.

Delegates form the glue connecting the events and event handlers and are instrumental in conveying to all the subscribed event handlers about the firing of an event. What I will try to show in this post are the following:
1. High-level conceptual coverage of delegates,
2. Technical overview of delegates,
3. How a delegate can execute a connected method, aka an event handler

High-Level Overview:

Consider the following scenario where you have a pipeline going out from a pumping station and connected to that pipeline are houses and industries. Houses will use the supplied water for cleaning, cooking, etc. while the industries will use the same water for chemical and other manufacturing purposes. Water Pumping station supplying water to homes and industries


 

The pumping station here is synonymous with an event firing when an action of interest takes place. The water is the information conveyed by the event, and the houses and industries are the event handlers which process the supplied water, aka the information in the desired form. Delegates form the plumbing connecting the events and event handlers.

Before we move any further, we must understand what is a callback mechanism.

Callback Mechanism

Essentially, a callback mechanism that lets us chain pieces of code together in such a way that the first piece calls the second. The call can be made in two ways:

1) Synchronous callbacks, in which the first code (calling code) waits for the second code (called code) to finish the work and deliver the result, and, only after that resumes execution.
E.g. You can have a windows form application which has a button which upon clicking starts execution of a long running process. While the process is running the windows form will remain unresponsive.

2) Asynchronous callbacks, where the first code calls the second code but does not wait for it to finish the work. The work by the second piece of code is carried out on a different thread.

Callback mechanism is known by different names in different programming languages. In C, it is known as function pointers, and in C# it is called “delegates”.

In C#, delegates form the plumbing that allows one code to call another code synchronously or asynchronously. Adding to that is the fact that they are type-safe, i.e. you can only call methods that comply with the delegate’s signature and return type. In this post we will focus only on the synchronous calling via delegates

Use cases for delegates

Delegates can be used in two scenarios:

1. To broadcast changes in an object’s state
2. To broadcast fulfillment of a condition

Delegates behind the scenes are a special type of class, which means that we need to create an object of delegate type to use it for callback purposes.

Let’s dive into practical examples that show the use cases in action.

Broadcast changes in an object’s state


Consider human beings and their sensitiveness to ambient temperature. We can create a class “Human” and broadcast the response of its object to different temperatures.

public class Human
    {
        // Basic properties
        public int Id { get; set; }
        public string Name { get; set; }
        public float MinTemp { get; set; }
        public float MaxTemp { get; set; }
 
        //Delegate declaration
        public delegate void TemperatureSensationHandler(string message);
 
        // Member variable of the declared delegate
        public TemperatureSensationHandler temperatureSensationMembers;
 
        public Human()
        {
        }
 
        // User defined constructor
        public Human(int id, string name, float minTemp, float maxTemp)
        {
            Id = id;
            Name = name;
            minTemp = MinTemp;
            maxTemp = MaxTemp;
        }
 
        public void TemperatureSensation(float temperature)
        {
            if (temperatureSensationMembers!= null)
            {
                if (temperature > MaxTemp)
                {
                    temperatureSensationMembers("I am feeling hot");
                }
                else if (temperature < MinTemp)
                {
                    temperatureSensationMembers("I am feeling cold");
                }
                else
                {
                    temperatureSensationMembers("I am feeling normal");
                }
            }         
        }
    }

 

In the code above, we have a class "Human" with four properties, of which two are of importance - MinTemp and MaxTemp. The method TemperatureSensation accepts a float type parameter and depending upon where the value stands in the spectrum between MinTemp and MaxTemp, broadcasts an appropriate message. In the following code, we create an object of the Human class and test the functionality of delegate declared in the Human class.

class Program
    {
        static void Main(string[] args)
        {
            Human human = new Human() 
{ Id = 1, Name = "Parakh", MinTemp = 25.5f, MaxTemp = 45.0f };
 
            human.temperatureSensationMembers += OnTemperatureSensationEvent;
            human.TemperatureSensation(100);
            human.TemperatureSensation(30);
            human.TemperatureSensation(10);
            Console.ReadLine();
        }
 
        public static void OnTemperatureSensationEvent(string message)
        {
            Console.WriteLine(message);
        }
    }

 

In the code above we created an object of the Human class and instantiated it with some data. Then, we enrolled in the delegate object a method that matched the delegate’s signature and return type. Then we invoked the TemperatureSensation method on the human object which then broadcasted the messages to the delegate’s subscribed members.

The above code when executed yields the following outputBoradcast changes in an object's state

Now there’s one flaw in the code above. Delegate class at the .Net framework level overrides the “+” and “-“ operators, allowing the subscription and unsubscription of complying methods. It means that having direct access to the delegate object is sensitive and a developer should not be able to empty a delegate object’s invocation list by using the “-“ operator repeatedly.

We thus encapsulate the subscription and unsubscription of the methods to the delegate object via methods.

public class Human
    {
        // Basic properties
        public int Id { get; set; }
        public string Name { get; set; }
        public float MinTemp { get; set; }
        public float MaxTemp { get; set; }
 
        //Delegate declaration
        public delegate void TemperatureSensationHandler(string message);
 
        // Member variable of the declared delegate
        private TemperatureSensationHandler temperatureSensationMembers;
 
        //Registration and un-registration methods
        public void RegisterTempSensationHandlerMethods(TemperatureSensationHandler methodToRegister)
        {
            temperatureSensationMembers += methodToRegister;
        }
 
        public void UnRegisterTempSensationMethods(TemperatureSensationHandler methodToUnregister)
        {
            if (temperatureSensationMembers != null)
            {
                temperatureSensationMembers -= methodToUnregister;
            }
        }
 
        public Human()
        {
 
        }
 
        // User defined constructor
        public Human(int id, string name, float minTemp, float maxTemp)
        {
            Id = id;
            Name = name;
            minTemp = MinTemp;
            maxTemp = MaxTemp;
        }
 
        public void TemperatureSensation(float temperature)
        {
            if (temperatureSensationMembers!= null)
            {
                if (temperature > MaxTemp)
                {
                    temperatureSensationMembers("I am feeling hot");
                }
                else if (temperature < MinTemp)
                {
                    temperatureSensationMembers("I am feeling cold");
                }
                else
                {
                    temperatureSensationMembers("I am feeling normal");
                }
            }         
        }
    }

With the modified code, we now can register and un-register the methods without exposing the delegate object to a developer.

The code above yields the same output.

Boradcast changes in an object's state

Broadcast of fulfilment of a condition


Some conditions warrant broadcast of information when they materialize. Examples would include intimation upon sending or receiving a message, dispatch, and delivery of a physical item in case of e-commerce applications, etc.

Here we will take an example of checking out a book from a library.

We have four classes - Book, Member, Author, and Library. The idea in this example is to broadcast a message upon execution of certain activities like searching for a book, searching for member records, and successful or unsuccessful checking out of a book.

public class Author
   {
        public int AuthorId { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
    }
 
public class Book
    {
        public int BookId { get; set; }
        public string ISBN { get; set; }
        public string Title { get; set; }
        public Author Author { get; set; }
        public int PageCount { get; set; }
        public bool IsCheckedOut { get; set; }
    }
 
public class Member
    {
        public int MemberId { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public int NumberOfBooksCheckedOut { get; set; }
        public List<Book> BooksCheckedOut { get; set; }
    }
 
    public class Library
    {
        public int LibraryId { get; set; }
        public List<Book> Books { get; set; }
        public List<Member> Members { get; set; }
        public Library()
        {
            LibraryId = 1;
            Books = new List<Book>()
                        {
                            new Book()
                            {
                                BookId = 1, 
                                Title = "Alice in Wonderland", 
                                IsCheckedOut = false, 
                                PageCount = 200,
                                Author = new Author() 
                                { 
                                    AuthorId = 1, 
                                    FirstName = "Lewis", 
                                    LastName = "Carroll"
                                }
                            },
                            new Book()
                            { 
                                BookId = 2, 
                                Title = "Bad Blood", 
                                IsCheckedOut = false, 
                                PageCount = 350,
                                Author = new Author()
                                { 
                                    AuthorId = 2, 
                                    FirstName = "John", 
                                    LastName = "Carreyrou"
                                }
                            },
                            new Book()
                            { 
                                BookId =  3, 
                                Title = "The Dream Machine", 
                                IsCheckedOut = false, 
                                PageCount = 250,
                                Author = new Author()
                                { 
                                    AuthorId = 3, 
                                    FirstName="Mitchell", 
                                    LastName = "Waldrop"
                                }
                            },
                            new Book()
                            { 
                                BookId = 4, 
                                Title = "The Structure of Scientific Revolution", 
                                IsCheckedOut = false,
                                PageCount = 500, 
                                Author = new Author()
                                { 
                                    AuthorId = 4, 
                                    FirstName = "Thomas", 
                                    LastName=  "Kuhn"
                                }
                            },
                            new Book()
                            { 
                                BookId =5,
                                Title = "Sapiens: A Brief History of Humankind", 
                                IsCheckedOut = false,
                                PageCount = 450, 
                                Author =new Author() 
                                { 
                                    AuthorId = 5, 
                                    FirstName = "Yuval", 
                                    LastName = "Hariri"
                                }
                            }
                        };
            Members = new List<Member>()
                        {
                            new Member()
                            { 
                                MemberId = 1, 
                                FirstName = "Parakh", 
                                LastName = "Singhal", 
                                NumberOfBooksCheckedOut = 0
                            },
                            new Member()
                            { 
                                MemberId = 2, 
                                FirstName = "Prateek", 
                                LastName = "Mathur", 
                                NumberOfBooksCheckedOut = 0
                            },
                            new Member()
                            { 
                                MemberId =3, 
                                FirstName = "Sumant", 
                                LastName = "Sharma", 
                                NumberOfBooksCheckedOut = 0
                            }
                        };
 
        }
 
        //Delegate which can accept methods having a string parameter
        public delegate void CheckoutHandler(string message);
 
        private CheckoutHandler checkOutHandlerMembers;
 
        public void RegisterCheckOutHandlerMembers(CheckoutHandler methodToRegister)
        {
            checkOutHandlerMembers += methodToRegister;
        }
 
        public void UnRegisterCheckOutHandlerMembers(CheckoutHandler methodToUnregister)
        {
            if (checkOutHandlerMembers != null)
            {
                checkOutHandlerMembers -= methodToUnregister;
            }
        }
 
        public void CheckOutBook(int MemberId, int bookId)
        {
            //The operation of search will be printed out by the program only 
            //if there are members to the delegate type
            if (checkOutHandlerMembers != null)
            {
                checkOutHandlerMembers("Searching for book and Member records");
            }
 
            Book bookToBeCheckedOut = Books.Find(book => book.BookId == bookId);
            Member MemberCheckingOut = Members.Find(Member => Member.MemberId == MemberId);
 
            //The operation of success in finding the book and Member will
            //only be printed if there are members to the delegate type
            if (bookToBeCheckedOut != null &&
                MemberCheckingOut != null && 
                checkOutHandlerMembers != null)
            {
                checkOutHandlerMembers("Book and Member records found.");
            }
 
            if (bookToBeCheckedOut.IsCheckedOut == false)
            {

//The final result, whether the book can be checked

//out needs to be printed whether

                //there are any members to the delegate type. 
                //That's why if-else statement with duplicate
                //console writes
                if (checkOutHandlerMembers != null)
                {
                    checkOutHandlerMembers("Congratulations, the book is available");
                }
                else
                {
                    Console.WriteLine("Congratulations, the book is available");
                }
            }
            else
            {
                if (checkOutHandlerMembers != null)
                {

checkOutHandlerMembers("Apologies, but the book is already

checked out to an existing Member.");

                }
                else
                {

Console.WriteLine("Apologies, but the book is

already checked out to an existing Member.");

                }
            }
        }
    }
}

 

The model created above is consumed in the console application.

class Program
    {
        static void Main(string[] args)
        {
            Library library = new Library();
            library.RegisterCheckOutHandlerMembers(OnCheckingOutEvent);            
            library.CheckOutBook(1, 1);
            
            Console.ReadLine();
        }
 
        public static void OnCheckingOutEvent(string message)
        {
            Console.WriteLine(message);
        }
    }

 

When executed the following output appears:

Broadcast of fulfillment of a conditionIn the output shown above, the program emitted certain messages related to the searching of the member and book details and successful checkout of the book. Now, remember, the domain is responsible for emitting the information, but what we do with the information is entirely in the hands of the developer. We can compose an email or an SMS and embed the emitted message or have it logged in a log file or a database.

Summary

Delegates help in separating the act of broadcasting information and how that information is consumed.
I hope this post helped in laying a strong foundation by presenting the concept of delegates in an easy to understand manner.

References

  1. Pro C# 8 with .NET Core 3 by Andrew Troelsen and Phil Japikse
  2. CLR via C# by Jeffrey Richter

Delegates And Events–Part 1 of 3

In my experience, if there’s one topic in C# that baffles the programmers the most, it is the topic of events and delegates. And the root of the state of confusion is not the lack of attempt on the programmer’s part, but the way the books on programming languages explain the concept – in a dry academic sense, devoid of any relatability to real life.

This blog post is my attempt to explain the concept of events and delegates in an easy way.

Events in real life are occurrences where something of importance happened to an entity. Let me elaborate on that. Consider a normal human being. Events in the life of a normal human being would be:
1. Birth
2. First day of school
3. Graduation from school
4. First day of college
5. Graduation from college
6. First pay-check
7. Getting married
8. First child
…and so on. The aforementioned occurrences would normally be considered important, hence endowed with a special term, events.

Whenever an event occurs, there’s something done. Birth of a child is celebrated, graduation from school is celebrated with a farewell party, marriages are celebrated with near and dear ones, etc.

This is what is the inspiration for the abstraction, aptly called events in modern object-oriented programming languages. Events are important occurrences in the life-span of an entity. An entity can be a button in a Windows Form, a dropdown on a web page, or some kind of a custom activity in a user-defined object like a domain class. Correspondingly, the event would be pressing a button, changing the selected element in a dropdown or value of a property in a domain class.

Whenever some event occurs or gets fired, something is done. That something is done in some defined code, called the event handler. An event handler is essentially a method, with details of what needs to be done when the event is fired.

Now that we have understood the concept of events and event handlers, let’s understand the concept of delegates. Delegates are like the consciousness that pushes us to do something when an event occurs. Essentially, a delegate is the glue that ties an event and an event handler together.

To summarise:
When something of importance happens -> Event
What we do in case of an event -> Event Handler
The glue connecting the event and event handler –> Delegate

I hope that this post was able to sow the idea of events and delegates your mind in a non-technical and relatable way. In future posts, I will shed more light on the topic.

How to Create and Apply Git Patch Files

Introduction

In the years during which Git evolved, there was no provision of using a central remote repository which different contributors could use to merge their changes to a feature branch or request pull requests. Sending patch file(s) via email was the name of the game. A patch file is a handy way of encapsulating the changes introduced into the repository as part of a commit, into a single file. Patches can be created on a per commit basis or a bunch of commits can be squashed into a single patch file.

Git Format-Patch

The command that is used to create a patch file is format-patch. Various options are available that can be used in conjunction with the command to modify the output as desired. Let’s take a look at the most common operations that use format-patch command.

To learn about the command let’s create a repository and track a text file with some text entries.

$ mkdir PatchFiles
$ cd PatchFiles
$ git init
$ touch file1.txt
$ printf "first line" > file1.txt
$ git add .
$ git commit -m "First commit"
$ printf "\nsecond line" >> file1.txt
$ git commit -am "Second commit"
$ git log --online master

Creation of master branch

Next let’s create a branch named develop off of the master branch and track a text file with some text entries.

$ git branch develop
$ git checkout develop
$ touch file2.txt
$ printf "first line" > file2.txt
$ git add .
$ git commit -m "First commit in develop branch"
$ printf "\nsecond line" >> file2.txt
$ git commit -am "Second commit in develop branch"
$ git log --oneline

Creation of develop branch

Now that we have laid the groundwork, let’s understand the format-patch command. The format-patch command takes in the name of the branch against which you want to compare the commits of the current branch pointed to by the INDEX pointer. By default, the format-patch creates a patch file for every commit not available in the target branch.

$ git format-patch master
$ dir

Format Patch Multiple files

In earlier times, when there was no remote repository, such patch files used to be mailed by project contributors to the project maintainer.
Next, we will discuss the two commands that can be used to apply the patch files, i.e., apply and am.

Git Apply

There could be situations that demand the application of changes in a patch to the target branch, but, not include the corresponding commit message into the commit history of the target branch. Git’s apply command is used in such situations.

Apply command helps you to test out the changes introduced by the patch locally before you formally commit them into the repository. Since the changes are applied to the working directory, you can use git diff to view the changes applied. Also, the commit message can be a bit more descriptive signifying the nature of the changes applied.

Let’s see the command in action in code.

$ git checkout master
$ git apply 0001-First-commit-in-develop-branch.patch
$ git status
$ git log --oneline
$ dir

 05 Apply command - cropped

Git AM

Git’s am (Apply Mailbox) command is used in a situation when you are assured of the commits in a patch and want them to be applied verbatim, along with their corresponding metadata consisting of author’s information and commit message.

$ git checkout master
$ git am 0001-First-commit-in-develop-branch.patch
$ git status
$ git log --oneline
$ dir

06 AM command - cropped

Word of Caution

If you are selectively applying patches, and if a patch contains an artifact that comes to life in a prior patch, then the application of such a patch will fail. This is because patch files are commit-specific and carry changes specific to the commit.
In such cases, you will be required to sequentially apply patches in chronological order, first bringing the artifact into existence, followed by any changes in the artifact.

$ git checkout master
$ git am 0002-Second-commit-in-develop-branch.patch

07 Problem with patch application - cropped

 

We were not able to successfully apply the second patch as the file2.txt came into existence in the first commit, hence first patch. It would be imperative here, to first apply the first patch and then the second patch.

Conclusion

Git provides the ability to create patches as discreet units signifying the evolution of a project and enables the sharing of this evolution in the absence of a central remote repository.

Git offers two ways to apply patches as per requirements.

Git’s apply command applies the changes locally to the working directory and gives you the freedom to further introduce changes in the artifacts before formally committing into the repository.

Git’s am command applies the patches in the repository and includes the commit messages contained in the patches. It retains vital metadata like author information, email, etc..

A short note on Git Cherry-Pick

Key Takeaway

In the real world, situations arise where some urgent work is done on a codebase that is required to be released to mainstream users. This work then needs to be bought down into the feature branches to maintain parity with the master branch.

In Git, we have the command of Rebase to handle such a situation, but in some cases, a more surgical approach is required, whereby, only the commit having the changes is required to be applied to the feature branch and exclude anything else which might otherwise pollute the feature branch. This is what Git’s cherry-pick command achieves.

Read On

One of the founding premises of Git is to commit often and commit early. This aspect of the foundation enables you, the user, to see the evolution of your work over time and enables Git to take the responsibility to manage the evolution in a discreet order via commits. By having small commits, you enable Git to recover your work in a granular manner, should the situation come to be, as desired. And dovetailing this practice of small and frequent discreet commits, revolves the power of the command of cherry-pick.

Git’s cherry-pick command helps you apply a commit in a surgical manner such that it only brings in the changes corresponding to a commit to the branch under concern. This command can be thought to be useful in scenarios whereby you have to bring in a limited number of changes that are pertinent to your work. Examples:
1. There’s a hotfix branch created from the master branch and you need to include the hotfix in your feature branch.
2. There are a couple of commits that you need to take from some other feature branch and incorporate in your branch

Let’s take a look at an example to understand the cherry-pick command.

$ mkdir cherrypick
$ cd cherrypick
$ git init
$ touch file1.txt
$ printf "first line" > file1.txt
$ git add .
$ git commit -m "First commit in master branch"
$ printf "\nsecond line" >> file1.txt
$ git commit -am "Second commit in master branch"
$ printf "\nthird line" >> file1.txt
$ git commit -am "Third commit in master branch"
$ git branch develop
$ git checkout develop
$ touch file2.txt
$ printf “first line” >> file2.txt
$ git add .
$ git commit -m “First commit in develop branch”
$ printf “\nsecond line” >> file2.txt
$ git commit -am “Second commit in develop branch”
$ printf “\nthird line” >> file2.txt
$ git commit -am “Third commit in develop branch”
$ git checkout master
$ git log --oneline develop
$ git cherry-pick SHA1 hash of the second commit from the develop branch

 

Resolve any conflicts that arise from the cherry pick here and then run a git log

$ git log --oneline master

 

In this example, we created a directory aptly named cherrypick and then inserted some text entries into a text file named file1.txt. We then created a branch called develop from the master branch. In the develop branch we created some entries in a text file named file2.txt and then moved back to the master branch. Now listed the log entries from the develop feature branch and noted down the SHA1 hash of the commit corresponding to which we want to pick the changes from the develop branch. In this case, we picked the second commit from the develop branch.

01 Creation of repoFigure 1 Creating the repository and some initial commits

02 In Develop BranchFigure 2 Creation of a branch and some initial commits

03 In Develop BranchFigure 3 Going back to the branch where a cherry-picked commit needs to be merged

Finally, we used Git’s cherry-pick command in the master branch in conjunction with the SHA1 hash noted before. This will bring in the changes from the develop branch into the master branch that were made corresponding to the commit hash. In doing so, you may be required to resolve any conflict that may arise.

04 In Master BranchFigure 4 Process of cherry picking may throw up some conflicts which would require intervention

05 In Master BranchFigure 5 Once a commit is cherry picked, its content will then become available in the target branch

This way we can literally cherry-pick the changes corresponding to specific commit(s) in a branch and bring them to the branch of our choice.

Hope this was helpful.