
Dataset Name: brain_blmcf_ext_all_differences
Group: sec_filing
Vendor: Brain
Asset Class: Equity
Data Update Time(s): 7:05 AM EST
Data Update Frequency: day
The Brain Language Metrics on Company Filings (BLMCF) dataset monitors several language metrics on 10-Ks and 10-Qs for 6000+ US stocks. This EXTENDED version provides additional language metrics covering not only the report as a whole but also specific report sections (e.g. the Risk Factors and MD&A sections).
Data Contained in this Dataset
Column | Type | Description |
---|---|---|
_seq | uint | Internal sequence number used to keep data rows in order |
timestamp | string | Timestamp of the data, in America/New_York time. |
muts | uint64 | Microseconds Unix Timestamp. An integer representation of a timestamp with microsecond precision that can be compared directly to other timestamps. |
symbol | string | Trading Symbol or Ticker |
COMPOSITE_FIGI | string | The FIGI composite code (https://www.openfigi.com) that identifies the stock across related exchanges in the same country. |
DATE | string | Date |
LAST_REPORT_DATE | string | The date of last report (with respect to DATE) issued by the company in YYYY-MM-DD format |
LAST_REPORT_CATEGORY | string | Category of last report (with respect to DATE) issued by the company. |
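LAST_REPORT_PERIOD | uint | The period of the last available report. For 10-K annual reports this is an integer number labelling the annual reports. For 10-Q quarterly reports this is an integer number from 1 to 3 labelling the period of the report. |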
PREV_REPORT_DATE | string | The date of previous report (with respect to LAST_REPORT_DATE) issued by the company in YYYY-MM-DD format. |
PREV_REPORT_CATEGORY | string | Category of previous report (with respect to LAST_REPORT_DATE) issued by the company: 10-K or 10-Q. |
PREV_REPORT_PERIOD | uint | The period of the previous report. For 10-K annual reports this is an integer number labelling the annual reports. For 10-Q quarterly reports this is an integer number from 1 to 3 labelling the period of the report. This is used to perform differences between reports. |
DELTA_PERC_N_SENTENCES | double | Percentage change of the number of sentences between the last available report (LAST_REPORT_DATE) and the previous report of same period and category (PREV_REPORT_DATE). -inf, +inf |
DELTA_PERC_MEAN_SENTENCE_LENGTH | double | Percentage change of sentence length (mean number of words per sentence) between the last available report (LAST_REPORT_DATE) and the previous report of same period and category (PREV_REPORT_DATE). |
DELTA_SENTIMENT | double | The difference of financial sentiment between the last available report (LAST_REPORT_DATE) and the previous report of same period and category (PREV_REPORT_DATE). -2.0 to +2.0 |
DELTA_SCORE_UNCERTAINTY | double | The difference of percentage of financial domain “uncertainty” language between the last available report (LAST_REPORT_DATE) and the previous report of same period and category (PREV_REPORT_DATE). |
DELTA_SCORE_LITIGIOUS | double | The difference of percentage of financial domain “litigious” language between the last available report (LAST_REPORT_DATE) and the previous report of same period and category (PREV_REPORT_DATE). -1.0 to +1.0 |
DELTA_SCORE_CONSTRAINING | double | The difference of percentage of financial domain “constraining” language between the last available report (LAST_REPORT_DATE) and the previous report of same period and category (PREV_REPORT_DATE). -1.0 to +1.0 |
DELTA_SCORE_INTERESTING | double | The difference of percentage of financial domain “interesting” language between the last available report (LAST_REPORT_DATE) and the previous report of same period and category (PREV_REPORT_DATE). -1.0 to +1.0 |
DELTA_READABILITY | double | The difference of the readability metric between the last available report (LAST_REPORT_DATE) and the previous report of same period and category (PREV_REPORT_DATE). -inf, +inf |
DELTA_LEXICAL_RICHNESS | double | The difference of the lexical richness metric between the last available report (LAST_REPORT_DATE) and the previous report of same period and category (PREV_REPORT_DATE). -1.0 to 1.0 |
DELTA_LEXICAL_DENSITY | double | The difference of the lexical density metric between the last available report (LAST_REPORT_DATE) and the previous report of same period and category (PREV_REPORT_DATE). -1.0 to 1.0 |
DELTA_SPECIFIC_DENSITY | double | The difference of the specific density metric between the last available report (LAST_REPORT_DATE) and the previous report of same period and category (PREV_REPORT_DATE). -1.0 to 1.0 |
SIMILARITY_ALL | double | The language similarity between the last available report (LAST_REPORT_DATE) and the previous report of same period and category (PREV_REPORT_DATE). 0.0 - 1.0 |
SIMILARITY_POSITIVE | double | The similarity in terms of financial domain “positive” language between the last available report (LAST_REPORT_DATE) and the previous report of same period and category (PREV_REPORT_DATE). 0.0 - 1.0 |
SIMILARITY_NEGATIVE | double | The similarity in terms of financial domain “negative” language between the last available report (LAST_REPORT_DATE) and the previous report of same period and category (PREV_REPORT_DATE). 0.0 - 1.0 |
SIMILARITY_UNCERTAINTY | double | The similarity in terms of financial domain “uncertainty” language between the last available report (LAST_REPORT_DATE) and the previous report of same period and category (PREV_REPORT_DATE). 0.0 - 1.0 |
SIMILARITY_LITIGIOUS | double | The similarity in terms of financial domain “litigious” language between the last available report (LAST_REPORT_DATE) and the previous report of same period and category (PREV_REPORT_DATE). 0.0 - 1.0 |
SIMILARITY_CONSTRAINING | double | The similarity in terms of financial domain “constraining” language between the last available report (LAST_REPORT_DATE) and the previous report of same period and category (PREV_REPORT_DATE). 0.0 - 1.0 |
SIMILARITY_INTERESTING | double | The similarity in terms of financial domain “interesting” language between the last available report (LAST_REPORT_DATE) and the previous report of same period and category (PREV_REPORT_DATE). 0.0 - 1.0 |
RF_DELTA_PERC_N_SENTENCES | double | Percentage change of the number of sentences between the “Risk Factors” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). -inf, +inf |
RF_DELTA_PERC_MEAN_SENTENCE_LENGTH | double | Percentage change of mean sentence length between the “Risk Factors” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). -inf, +inf |
RF_DELTA_SENTIMENT | double | The difference of financial sentiment between the “Risk Factors” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). -2.0 to +2.0 |
RF_DELTA_SCORE_UNCERTAINTY | double | The difference of percentage of financial domain “uncertainty” language between the “Risk Factors” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). -1.0 to +1.0 |
RF_DELTA_SCORE_LITIGIOUS | double | The difference of percentage of financial domain “litigious” language between the “Risk Factors” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). -1.0 to +1.0 |
RF_DELTA_SCORE_CONSTRAINING | double | The difference of percentage of financial domain “constraining” language between the “Risk Factors” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). -1.0 to +1.0 |
RF_DELTA_SCORE_INTERESTING | double | The difference of percentage of financial domain “interesting” language between the “Risk Factors” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). -1.0 to +1.0 |
RF_DELTA_READABILITY | double | The difference of the readability metric between the “Risk Factors” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). -inf, +inf |
RF_DELTA_LEXICAL_RICHNESS | double | The difference of the lexical richness metric between the “Risk Factors” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). -1.0 to +1.0 |
RF_DELTA_LEXICAL_DENSITY | double | The difference of the lexical density metric between the “Risk Factors” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). |
RF_DELTA_SPECIFIC_DENSITY | double | The difference of the specific density metric between the “Risk Factors” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). |
RF_SIMILARITY_ALL | double | The language similarity between the “Risk Factors” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). 0.0 - 1.0 |
RF_SIMILARITY_POSITIVE | double | The similarity in terms of financial domain “positive” language between the “Risk Factors” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). 0.0 - 1.0 |
RF_SIMILARITY_NEGATIVE | double | The similarity in terms of financial domain “negative” language between the “Risk Factors” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). 0.0 - 1.0 |
MD_DELTA_PERC_N_SENTENCES | double | Percentage change of the number of sentences between the “MD&A” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). -inf, +inf |
MD_DELTA_PERC_MEAN_SENTENCE_LENGTH | double | Percentage change of mean sentence length between the “MD&A” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). |
MD_DELTA_SENTIMENT | double | The difference of financial sentiment between the “MD&A” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). -2.0 to +2.0 |
MD_DELTA_SCORE_UNCERTAINTY | double | The difference of percentage of financial domain “uncertainty” language between the “MD&A” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). -1.0 to +1.0 |
MD_DELTA_SCORE_LITIGIOUS | double | The difference of percentage of financial domain “litigious” language between the “MD&A” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). -1.0 to +1.0 |
MD_DELTA_SCORE_CONSTRAINING | double | The difference of percentage of financial domain “constraining” language between the “MD&A” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). -1.0 to +1.0 |
MD_DELTA_SCORE_INTERESTING | double | The difference of percentage of financial domain “interesting” language between the “MD&A” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). -1.0 to +1.0 |
MD_DELTA_READABILITY | double | The difference of the readability metric between the “MD&A” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). -inf to +inf |
MD_DELTA_LEXICAL_RICHNESS | double | The difference of the lexical richness metric between the “MD&A” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). -1.0 to +1.0 |
MD_DELTA_LEXICAL_DENSITY | double | The difference of the lexical density metric between the “MD&A” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). |
MD_DELTA_SPECIFIC_DENSITY | double | The difference of the specific density metric between the “MD&A” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). |
MD_SIMILARITY_ALL | double | The language similarity between the “MD&A” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). 0.0 to 1.0 |
MD_SIMILARITY_POSITIVE | double | The similarity in terms of financial domain “positive” language between the “MD&A” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). |
MD_SIMILARITY_NEGATIVE | double | The similarity in terms of financial domain “negative” language between the “MD&A” section of the last available report (LAST_REPORT_DATE) and the same section of the previous report of same period and category (PREV_REPORT_DATE). |
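
The sketch below may help when working with the columns listed above. It shows one possible way to turn a muts value into a timezone-aware datetime and to pull out the metrics of a single report section using the RF_ and MD_ column prefixes. The helper names (muts_to_datetime, section_metrics), the dict-per-row representation, and every value in the sample row are assumptions made for illustration; they are not part of the vendor's tooling or data. Non-finite values are skipped because several columns are documented with an unbounded (-inf, +inf) range.

```python
import math
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def muts_to_datetime(muts, tz=timezone.utc):
    """Convert a microsecond Unix timestamp (the muts column) to an aware datetime."""
    return (EPOCH + timedelta(microseconds=muts)).astimezone(tz)


def section_metrics(row, prefix):
    """Return the finite float metrics of one report section, keyed without the prefix.

    prefix is "RF_" (Risk Factors) or "MD_" (MD&A), following the column names above.
    """
    return {
        name[len(prefix):]: value
        for name, value in row.items()
        if name.startswith(prefix)
        and isinstance(value, float)
        and math.isfinite(value)
    }


# Hypothetical row: symbol and all values are invented for illustration only.
row = {
    "symbol": "XYZ",
    "muts": 1_700_000_000_000_000,
    "SIMILARITY_ALL": 0.97,
    "RF_SIMILARITY_ALL": 0.91,
    "RF_DELTA_PERC_N_SENTENCES": float("inf"),
    "MD_DELTA_SENTIMENT": -0.12,
}

print(muts_to_datetime(row["muts"]).isoformat())
print(section_metrics(row, "RF_"))
print(section_metrics(row, "MD_"))
```

To compare against the timestamp column, which is given in New York time, the same helper can be called with tz=ZoneInfo("America/New_York") from the standard zoneinfo module (Python 3.9+), provided time zone data is available on the system.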