The new data set describes who likes whom denoted by the person ID. The record type of the data set is:
Record_Like := RECORD
UNSIGNED8 SourceId;
UNSIGNED8 TargetId;
END;
And here are some sample data:
likes_file := DATASET([
{1,2},{1,3},{1,4},{1,5},{1,6},{1,7},{1,8},{1,9},{1,10},
{2,3},{2,4},{2,6},{2,7},{2,8},
{3,8},{3,9},{3,11},
{4,5},
{5,7},
{6,7},{6,9},
{7,4},{7,5},{7,6},
{8,3},{8,9},
{9,3},{9,6},{9,8},
{12,1}], Record_Like);
It's hard to figure out who likes whom just by looking at a pair of IDs in the likes_file data set. To make it easier, we'll combine the members_file data set we have seen in previous articles with the new likes_file data set so that the person names are spelled out. The joined results will have the record type:
Record_Named_Like := RECORD
SourceLastName := members_file.LastName;
SourceFirstName := members_file.FirstName;
Record_Like;
TargetLastName := members_file.LastName;
TargetFirstName := members_file.FirstName;
END;
Essentially, Record_Named_Like extends Record_Like with 4 additional fields. The data types for the SourceLastName, SourceFirstName, TargetLastName and TargetFirstName fields are the same as LastName or FirstName from the members_file data set.
First, we are going to join the 2 data sets so that Record_Member.Id is Record_Like.SourceId. We'll copy the last and first name from the corresponding Record_Member record from likes_file to a new Record_Named_Like record.
Record_Named_Like JoinMembersAndLikes(Record_Member L, Record_Like R) := TRANSFORM
SELF.SourceLastName := L.LastName;
SELF.SourceFirstName := L.FirstName;
SELF.TargetLastName := '';
SELF.TargetFirstName := '';
SELF := R;
END;
member_likes_file := JOIN(members_file, likes_file, LEFT.Id=RIGHT.SourceId, JoinMembersAndLikes(LEFT, RIGHT));
The JOIN action joins the two data sets based on the condition LEFT.Id=RIGHT.SourceId where LEFT refers to members_file and RIGHT refers to likes_file. During joining, the transform function JoinMemberAndLikes is called to produce the resulting joint record of type Record_Named_Like.
We do another similar join such that Record_Member.Id is equal to Record_Named_Like.TargetId to identify the name of the liked person.
Record_Named_Like JoinLikesAndMembers(Record_Member L, Record_Named_Like R) := TRANSFORM
SELF.TargetLastName := L.LastName;
SELF.TargetFirstName := L.FirstName;
SELF := R;
END;
member_likes_member_file := JOIN(members_file, member_likes_file, LEFT.Id = RIGHT.TargetId, JoinLikesAndMembers(LEFT, RIGHT));
After both joins, the final data set member_likes_member_file looks like:
SourceLastName SourceFirstName SourceId Target TargetLastName TargetFirstName
Guinan 12 1 Picard Jean-Luc
Picard Jean-Luc 1 2 Riker William
...
Now, we can easily spot who likes whom by name. Let's go a bit further by doing some summary statistics and reporting. We want to generate a report that tells us how many likes there are for each person and of those likes how many are for Data and Worf. Precisely, the record type for the report is:
Record_Likes_Stats := RECORD
LastName := member_likes_member_file.SourceLastName;
FirstName := member_likes_member_file.SourceFirstName;
LikesData := COUNT(GROUP, member_likes_member_file.TargetLastName = 'Data');
LikesWorf := COUNT(GROUP, member_likes_member_file.TargetLastName = 'Worf');
TotalLikes := COUNT(GROUP);
END;
In SQL parlance, the first 2 fields (LastName and FirstName) in Record_Likes_Stats are the keys for our upcoming group-by query while the remaining fields (LikesData, LikesWorf and TotalLikes) are the aggregate functions performed on each group. For example, TotalLikes contains the count for each unique group of last and first name. LikesData contains the portion of the group count satisfying the filter condition of TargetLastName = 'Data'.
The group-by query is done by the TABLE action:
member_likes_stats := TABLE(member_likes_member_file, Record_Likes_Stats, SourceLastName, SourceFirstName);
The parameters to the TABLE action are the input data set, the returning record type and the group-by keys. Here is what the output looks like if you output member_likes_stats:
LastName FirstName LikesData LikesWorf TotalLikes
Crusher Beverly 0 0 2
Crusher Wesley 1 0 3
...
To limit the output to only those who like Data or Worf, you can add a filter to the OUTPUT action:
OUTPUT(member_likes_stats(LikesData > 0 OR LikesWorf > 0));
In this article, I have introduced two new ECL actions which are JOIN and TABLE. There are other ways of using these actions which I haven't described. To learn more, you can check out the ECL Language Reference.
No comments:
Post a Comment