brain_blmcf_ext_all_metrics

Dataset Name: brain_blmcf_ext_all_metrics


Group: sec_filing
Vendor: Brain
Asset Class: Equity
Data Update Time(s): 7:05 AM EST
Data Update Frequency: day

The Brain Language Metrics on Company Filings (BLMCF) monitors several language metrics on 10-Ks and 10-Qs for 5000+ US stocks. Most recent 10-K/Q

Data Contained in this Dataset

Column Type Description
_seq uint Internal sequence number used to keep data rows in order
timestamp string Timestamp of the data, in America/New_York time.
muts uint64 Microseconds Unix Timestamp. An integer representation of a timestamp with microsecond precision that can be compared directly to other timestamps.
symbol string Trading Symbol or Ticker
COMPOSITE_FIGI string The FIGI composite code (https://www.openfigi.com) that identifies the stock across related exchanges in the same country.
DATE string Date
LAST_REPORT_CATEGORY string Category of last report (with respect to DATE) issued by the company.
LAST_REPORT_DATE string The date of last report (with respect to DATE) issued by the company in YYYY-MM-DD format
N_SENTENCES uint Number of sentences extracted from the last available report. 1 - inf
MEAN_SENTENCE_LENGTH double The mean sentence length measured in terms of the mean number of words per sentence for the last available report. 1 - inf
SENTIMENT double The financial sentiment of the last available report. -1.0 to +1.0
SCORE_UNCERTAINTY double The percentage of financial domain “uncertainty” language present in the last report. 0.0 - 1.0
SCORE_LITIGIOUS double The percentage of financial domain “litigious” language present in the last report. 0.0 - 1.0
SCORE_CONSTRAINING double The percentage of financial domain “constraining” language present in the last report. 0.0 - 1.0
SCORE_INTERESTING double The percentage of financial domain “interesting” language present in the last report. 0.0 - 1.0
READABILITY double Reading grade level for the report, expressed as a number corresponding to a US education grade. The score is the average of various readability tests that measure how difficult the text is to understand (e.g. Gunning Fog Index). 0 - inf
LEXICAL_RICHNESS double Lexical richness measured in terms of the Type-Token Ratio (TTR), which divides the number of types (number of unique words) by the number of tokens (total number of words). The basic logic behind this measure is that if the text is more complex
LEXICAL_DENSITY double Lexical density to measure the text complexity by computing the ratio between number of lexical words (nouns, adjectives, lexical verbs, adverbs) divided by the total number of words in the document. 0.0 - 1.0
SPECIFIC_DENSITY double Percentage of words belonging to the specific dictionary used for company filings analysis present in the last available report. 0.0 - 1.0
RF_N_SENTENCES double Number of sentences extracted from the section “Risk Factors” of the last available report.
RF_MEAN_SENTENCE_LENGTH double The mean sentence length measured in terms of the mean number of words per sentence for the section “Risk Factors” of the last available report. 1 - inf
RF_SENTIMENT double The financial sentiment for the section “Risk Factors” of the last available report. -1.0 to +1.0
RF_SCORE_UNCERTAINTY double The percentage of financial domain “uncertainty” language present in the section “Risk Factors” of the last
RF_SCORE_LITIGIOUS double The percentage of financial domain “litigious” language present in the section “Risk Factors” of the last
RF_SCORE_CONSTRAINING double The percentage of financial domain “constraining” language present in the section “Risk Factors” of the last report. 0.0 - 1.0
RF_SCORE_INTERESTING double The percentage of financial domain “interesting” language present in the section “Risk Factors” of the last
RF_READABILITY double Reading grade level for the section “Risk Factors” of the last available report. 1 - inf
RF_LEXICAL_RICHNESS double Lexical richness for the section “Risk Factors” of the last available report. 0.0 - 1.0
RF_LEXICAL_DENSITY double Lexical density for the section “Risk Factors” of the last available report. 0.0 - 1.0
RF_SPECIFIC_DENSITY double Percentage of words belonging to the specific dictionary used for company filings analysis present in the section “Risk Factors” of the last available report. 0.0 - 1.0
MD_N_SENTENCES double Number of sentences extracted from the “MD&A” sections of the last available report. 1 - inf
MD_MEAN_SENTENCE_LENGTH double The mean sentence length measured in terms of the mean number of words per sentence for the “MD&A”
MD_SENTIMENT double The financial sentiment for the “MD&A” sections of the last available report. -1.0 to +1.0
MD_SCORE_UNCERTAINTY double The percentage of financial domain “uncertainty” language present in the “MD&A” sections of the last report. 0.0 - 1.0
MD_SCORE_LITIGIOUS double The percentage of financial domain “litigious” language present in the “MD&A” sections of the last report. 0.0 - 1.0
MD_SCORE_CONSTRAINING double The percentage of financial domain “constraining” language present in the “MD&A” sections of the last
MD_SCORE_INTERESTING double The percentage of financial domain “interesting” language present in the “MD&A” sections of the last report. 0.0 - 1.0
MD_READABILITY double Reading grade level for the “MD&A” sections of the last available report. 0 - inf
MD_LEXICAL_RICHNESS double Lexical richness for the “MD&A” sections of the last available report. 0.0 - 1.0
MD_LEXICAL_DENSITY double Lexical density for the “MD&A” sections of the last available report. 0.0 - 1.0
MD_SPECIFIC_DENSITY double Percentage of words belonging to the specific dictionary used for company filings analysis present in the
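
The muts column is easiest to work with once it is converted to a timezone-aware datetime. The following is a minimal sketch in Python, assuming that muts counts microseconds since the Unix epoch in UTC (the table above does not state the reference timezone) and that the result should be shown in America/New_York time to match the timestamp column. The helper name muts_to_new_york is hypothetical and is not part of any CloudQuant or Brain API.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # Python 3.9+; some platforms also need the tzdata package

EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)
NEW_YORK = ZoneInfo("America/New_York")


def muts_to_new_york(muts: int) -> datetime:
    """Convert a microsecond Unix timestamp to America/New_York time.

    Assumption: muts is measured from 1970-01-01T00:00:00 UTC.
    """
    # Integer timedelta arithmetic keeps full microsecond precision,
    # unlike dividing by 1e6 and passing a float to fromtimestamp().
    return (EPOCH_UTC + timedelta(microseconds=muts)).astimezone(NEW_YORK)


# 2021-01-04 12:05:00 UTC, expressed in microseconds (an illustrative value).
example_muts = 1_609_761_900_000_000
print(muts_to_new_york(example_muts).isoformat())  # 2021-01-04T07:05:00-05:00
```

Because muts is a plain integer, rows can be ordered or compared on it directly; the conversion is only needed for display or for joining against calendar dates.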

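
The sentence-length and lexical-richness columns are simple ratios, which a toy example makes concrete. The sketch below computes a Type-Token Ratio (unique words divided by total words) and a mean sentence length for a short string. The tokenizer and sentence splitter are naive regular expressions chosen only for illustration; Brain's actual text processing is not described on this page, so the numbers produced here should not be expected to match values in the dataset.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into word tokens (naive rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def split_sentences(text: str) -> list[str]:
    """Split on whitespace that follows '.', '!' or '?' (naive rule)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def type_token_ratio(text: str) -> float:
    """Unique words (types) divided by total words (tokens)."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def mean_sentence_length(text: str) -> float:
    """Mean number of words per sentence."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)


sample = "Revenue may decline. Costs may rise. Revenue growth is uncertain."
print(type_token_ratio(sample))      # 8 unique words / 10 words
print(mean_sentence_length(sample))  # 10 words / 3 sentences
```

In the sample, "revenue" and "may" each appear twice, so 10 tokens contain 8 types. Repeating more words lowers the ratio toward 0, and a text with no repeated words scores 1.0.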

Important Dataset Notes

This dataset is not available for direct online purchase. To license it, contact the sales department at sales@CloudQuant.com, who can provide current pricing and a quote. Direct purchase may be unavailable for a number of reasons, such as the dataset's intended use, the size of the company (or investment fund) using the dataset, or legal requirements that CloudQuant needs to have in place before licensing the dataset to you.