With custom expressions in Spotfire, you have more control on how the data is to be handled and displayed. These expressions are also useful when managing calculated values and calculated columns.
One common thing you may want to do, is the equivalent of the SQL statement:
SELECT column1, ..., columnN, COUNT(columnX) FROM table GROUP BY column1, ..., columnN
To count how many rows have a particular value in columnX, grouping the results by the unique values in column1, ..., columnN
In Spotfire, this can be accomplished with the OVER expression:
Count([columnX]) OVER ([column1], ..., [columnN])
If you need a COUNT(DISTINCT(columnX)), you can use UniqueCount([columnX]) instead.
hi stefano, i'm looking for a custom expression as countfis in excel, but i need to return the number of occurrences per each row.
ReplyDeletemore specifically, I have 27 columns and i need to return the number of occurrences >=0 and <1 for each row.
could you help?
thanks!
tiziana
Hi Tiziana,
DeleteI haven't used Spotfire in about a year so I'm not remembering everything correctly but this is already not easy in SQL itself; it requires you to unpivot your rows and consider them one at a time.
Assume you have this table (Oracle DB syntax):
CREATE TABLE test( id NUMBER
,col1 NUMBER
,col2 NUMBER
,col3 NUMBER
);
And put some data into it:
INSERT ALL
INTO test (id, col1, col2, col3) VALUES (1, 0.2, 0.3, 1)
INTO test (id, col1, col2, col3) VALUES (2, 3, 4, 1)
INTO test (id, col1, col2, col3) VALUES (3, 0, 1, 0)
SELECT 1 FROM DUAL;
Now you can simulate countfis as:
SELECT id
,SUM(CASE WHEN val >= 0 AND val < 1 THEN 1 ELSE 0 END) AS cnt
FROM test
UNPIVOT(val FOR col_name IN (col1, col2, col3)) u
GROUP BY id
;
This will give you as output a row for each id with the count of items >=0 and < 1.
Can you translate this into a single Spotfire expression? I am unsure if you can unpivot data on the fly in the expression; if you cannot then you can still create an intermediate table with the unpivoted data and then run the expression on it. Basically you want your table to look like (keep the sample from before in mind):
ID, COL
id1, col1
id1, col2
id1, col3
id2, col1
id2, col2
id2, col3
.
.
.
Then you can easily sum(if([col] >=0 and [col] < 1, 1, 0)) over ([id]) or something similar.
Last resort you can always go with an IronPython script looping over all columns and rows to produce the result you want.
Cheers