08/02/2014

[TIBCO Spotfire] Count grouping by columns

With custom expressions in Spotfire, you have more control on how the data is to be handled and displayed. These expressions are also useful when managing calculated values and calculated columns.

One common thing you may want to do, is the equivalent of the SQL statement:

SELECT column1, ..., columnN, COUNT(columnX) FROM table GROUP BY column1, ..., columnN

To count how many rows have a particular value in columnX, grouping the results by the unique values in column1, ..., columnN

In Spotfire, this can be accomplished with the OVER expression:

Count([columnX]) OVER ([column1], ..., [columnN])

If you need a COUNT(DISTINCT(columnX)), you can use UniqueCount([columnX]) instead.

2 comments:

  1. hi stefano, i'm looking for a custom expression as countfis in excel, but i need to return the number of occurrences per each row.
    more specifically, I have 27 columns and i need to return the number of occurrences >=0 and <1 for each row.
    could you help?
    thanks!
    tiziana

    ReplyDelete
    Replies
    1. Hi Tiziana,

      I haven't used Spotfire in about a year so I'm not remembering everything correctly but this is already not easy in SQL itself; it requires you to unpivot your rows and consider them one at a time.

      Assume you have this table (Oracle DB syntax):

      CREATE TABLE test( id NUMBER
      ,col1 NUMBER
      ,col2 NUMBER
      ,col3 NUMBER
      );

      And put some data into it:

      INSERT ALL
      INTO test (id, col1, col2, col3) VALUES (1, 0.2, 0.3, 1)
      INTO test (id, col1, col2, col3) VALUES (2, 3, 4, 1)
      INTO test (id, col1, col2, col3) VALUES (3, 0, 1, 0)
      SELECT 1 FROM DUAL;

      Now you can simulate countfis as:

      SELECT id
      ,SUM(CASE WHEN val >= 0 AND val < 1 THEN 1 ELSE 0 END) AS cnt
      FROM test
      UNPIVOT(val FOR col_name IN (col1, col2, col3)) u
      GROUP BY id
      ;

      This will give you as output a row for each id with the count of items >=0 and < 1.

      Can you translate this into a single Spotfire expression? I am unsure if you can unpivot data on the fly in the expression; if you cannot then you can still create an intermediate table with the unpivoted data and then run the expression on it. Basically you want your table to look like (keep the sample from before in mind):

      ID, COL
      id1, col1
      id1, col2
      id1, col3
      id2, col1
      id2, col2
      id2, col3
      .
      .
      .

      Then you can easily sum(if([col] >=0 and [col] < 1, 1, 0)) over ([id]) or something similar.

      Last resort you can always go with an IronPython script looping over all columns and rows to produce the result you want.

      Cheers

      Delete

With great power comes great responsibility