Hello @kevin ,
I did a rigorous debugging and here are my findings.
The query executed on the DB -
2013-04-13 13:46:23 EDT LOG: execute : WITH cte_docs AS ( SELECT id FROM hts_document WHERE ver_from_instant <= $1 AND ver_to_instant > $2 AND corr_from_instant <= $3 AND corr_to_instant > $4 AND id IN ( SELECT DISTINCT doc_id FROM hts_doc2idkey, hts_document WHERE hts_doc2idkey.doc_id = hts_document.id AND hts_document.ver_from_instant <= $5 AND hts_document.ver_to_instant > $6 AND hts_document.corr_from_instant <= $7 AND hts_document.corr_to_instant > $8 AND idkey_id IN ( SELECT id FROM hts_idkey WHERE (key_scheme = $9 AND key_value = $10) OR (key_scheme = $11 AND key_value = $12) OR (key_scheme = $13 AND key_value = $14) OR (key_scheme = $15 AND key_value = $16) OR (key_scheme = $17 AND key_value = $18) OR (key_scheme = $19 AND key_value = $20) ) ) ORDER BY id ) SELECT main.id AS doc_id, main.oid AS doc_oid, main.ver_from_instant AS ver_from_instant, main.ver_to_instant AS ver_to_instant, main.corr_from_instant AS corr_from_instant, main.corr_to_instant AS corr_to_instant, nm.name AS name, df.name AS data_field, ds.name AS data_source, dp.name AS data_provider, ot.name AS observation_time, i.key_scheme AS key_scheme, i.key_value AS key_value, di.valid_from AS key_valid_from, di.valid_to AS key_valid_to FROM hts_document main INNER JOIN cte_docs ON main.id = cte_docs.id INNER JOIN hts_name nm ON (main.name_id = nm.id) INNER JOIN hts_data_field df ON (main.data_field_id = df.id) INNER JOIN hts_data_source ds ON (main.data_source_id = ds.id) INNER JOIN hts_data_provider dp ON (main.data_provider_id = dp.id) INNER JOIN hts_observation_time ot ON (main.observation_time_id = ot.id) LEFT JOIN hts_doc2idkey di ON (di.doc_id = main.id) LEFT JOIN hts_idkey i ON (di.idkey_id = i.id) ORDER BY main.id
2013-04-13 13:46:23 EDT DETAIL: parameters: $1 = ‘2013-04-13 17:46:23.13104’, $2 = ‘2013-04-13 17:46:23.13104’, $3 = ‘2013-04-13 17:46:23.13104’, $4 = ‘2013-04-13 17:46:23.13104’, $5 = ‘2013-04-13 17:46:23.13104’, $6 = ‘2013-04-13 17:46:23.13104’, $7 = ‘2013-04-13 17:46:23.13104’, $8 = ‘2013-04-13 17:46:23.13104’, $9 = ‘CUSIP’, $10 = ‘360271100’, $11 = ‘FREQUENCY’, $12 = ‘MONTHLY’, $13 = ‘ISIN’, $14 = ‘US3602711000’, $15 = ‘permno0’, $16 = ‘11658’, $17 = ‘permno1’, $18 = ‘ABC11658’, $19 = ‘ticker’, $20 = ‘FULT’
Its output as text file is around 400 MB.
Now the real issue is not anywhere here.
The JSON made for the call is around 91 MB.
And the browser is somehow not able to parse this JSON.
Looking into the JSON , there are 4 Lakh+ entires in the related field.
Further looking into it
grep object_id bad.json | wc -l
416043
grep object_id bad.json | sort | uniq | wc -l
138681
This means there are atleast 3 duplicates for all entries.
grep -A5 ‘DbHts~174301’ bad.json
,{ “object_id”: “DbHts~174301”,
“data_field”:“ProfitMargin”,
“data_source”:“BLOOMBERG”,
“data_provider”:“BLOOMBERG”,
“observation_time”:“CLOSE”
}
,{ "object_id": "DbHts~174301",
"data_field":"ProfitMargin",
"data_source":"BLOOMBERG",
"data_provider":"BLOOMBERG",
"observation_time":"CLOSE"
}
–
,{ “object_id”: “DbHts~174301”,
“data_field”:“ProfitMargin”,
“data_source”:“BLOOMBERG”,
“data_provider”:“BLOOMBERG”,
“observation_time”:“CLOSE”
}
Now coming to the obvious question. There are 121 related fields for this timeseries. But why on earth is it trying to get all these object ID’s ?
Also by the looks of it , i guess its dumping all the timeseries realted information into the JSON.
Hope these information were useful. I would be only glad to give you more info on the same.
Thanks
Vineeth