Exploratory Data Analysis
Checking whether the data needs to be cleaned or transformed:
summary(thyroid_df)
## subject_id hadm_id gender dob
## Min. : 3 Min. :100001 F:9450 2078-12-05 00:00:00: 40
## 1st Qu.:12076 1st Qu.:124705 M:8682 2052-02-14 00:00:00: 28
## Median :24633 Median :149789 2130-07-08 00:00:00: 21
## Mean :34134 Mean :149679 2066-10-13 00:00:00: 18
## 3rd Qu.:55085 3rd Qu.:174662 2071-06-27 00:00:00: 18
## Max. :99957 Max. :199993 2104-04-12 00:00:00: 17
## (Other) :17990
## itemid value valuenum valueuom
## Min. :50991 1.1 : 782 1.1 : 782 uIU/mL :10803
## 1st Qu.:50993 1.2 : 687 1.2 : 688 ng/dL : 3747
## Median :50993 1.3 : 609 1.3 : 609 ug/dL : 1555
## Mean :50994 1.4 : 535 1.4 : 535 uU/ML : 1259
## 3rd Qu.:50994 1.5 : 502 1.5 : 502 ng/dl : 336
## Max. :51001 1.0 : 428 1 : 428 uG/DL : 188
## (Other):14589 (Other):14588 (Other): 244
## charttime label
## 2129-03-12 13:40:00: 6 Thyroglobulin : 47
## 2153-04-15 21:25:00: 5 Thyroid Peroxidase Antibodies: 65
## 2160-01-30 15:50:00: 5 Thyroid Stimulating Hormone :12063
## 2195-10-20 03:21:00: 5 Thyroxine (T4) : 1743
## 2104-05-22 20:31:00: 4 Thyroxine (T4), Free : 3031
## 2105-06-18 16:15:00: 4 Triiodothyronine (T3) : 1183
## (Other) :18103
## fluid
## Blood:18132
##
##
##
##
##
##
## icd9_list
## 20280,20280,20280,20280,5845,5845,5845,5845,3363,3363,3363,3363,41519,41519,41519,41519,57400,57400,57400,57400,6822,6822,6822,6822,1363,1363,1363,1363,03843,03843,03843,03843,99591,99591,99591,99591,59080,59080,59080,59080,34830,34830,34830,34830,3220,3220,3220,3220,1125,1125,1125,1125,2841,2841,2841,2841,5781,5781,5781,5781,99811,99811,99811,99811,43491,43491,43491,43491,591,591,591,591,2761,2761,2761,2761,2930,2930,2930,2930,5848,5848,5848,5848,28800,28800,28800,28800,52809,52809,52809,52809,78061,78061,78061,78061,E9331,E9331,E9331,E9331,99931,99931,99931,99931,04112,04112,04112,04112,E8798,E8798,E8798,E8798,V6441,V6441,V6441,V6441,78559,78559,78559,78559,E8786,E8786,E8786,E8786,E9342,E9342,E9342,E9342,5933,5933,5933,5933,94224,94224,94224,94224,E8792,E8792,E8792,E8792,70705,70705,70705,70705,70722,70722,70722,70722,40390,40390,40390,40390,27542,27542,27542,27542: 6
## 24291,2768,28529 : 6
## 34510,5070,99731,51881,7802,E9380,24200,30423,V1581,V642,3051 : 6
## 41071,42731,00845,27651,2761,44024,70714,138,2720,4019,53081,2449,412,25000 : 6
## 4414,4414,4414,4233,4233,4233,42822,42822,42822,4239,4239,4239,2449,2449,2449,2948,2948,2948,45829,45829,45829,42789,42789,42789,4148,4148,4148,4019,4019,4019,3051,3051,3051,2809,2809,2809,41401,41401,41401,4280,4280,4280,4928,4928,4928,4439,4439,4439,412,412,412,V1581,V1581,V1581,V1301,V1301,V1301,V1083,V1083,V1083 : 6
## 4870,41071,4280,42832,27651,5859,42731,2724,4019,25050,36201,25040,58381,V4581,51889,41401,2411,2270 : 6
## (Other) :18096
## icd9_short_list
## Abdom aortic aneurysm,Abdom aortic aneurysm,Abdom aortic aneurysm,Cardiac tamponade,Cardiac tamponade,Cardiac tamponade,Chr systolic hrt failure,Chr systolic hrt failure,Chr systolic hrt failure,Pericardial disease NOS,Pericardial disease NOS,Pericardial disease NOS,Hypothyroidism NOS,Hypothyroidism NOS,Hypothyroidism NOS,Mental disor NEC oth dis,Mental disor NEC oth dis,Mental disor NEC oth dis,Iatrogenc hypotnsion NEC,Iatrogenc hypotnsion NEC,Iatrogenc hypotnsion NEC,Cardiac dysrhythmias NEC,Cardiac dysrhythmias NEC,Cardiac dysrhythmias NEC,Chr ischemic hrt dis NEC,Chr ischemic hrt dis NEC,Chr ischemic hrt dis NEC,Hypertension NOS,Hypertension NOS,Hypertension NOS,Tobacco use disorder,Tobacco use disorder,Tobacco use disorder,Iron defic anemia NOS,Iron defic anemia NOS,Iron defic anemia NOS,Crnry athrscl natve vssl,Crnry athrscl natve vssl,Crnry athrscl natve vssl,CHF NOS,CHF NOS,CHF NOS,Emphysema NEC,Emphysema NEC,Emphysema NEC,Periph vascular dis NOS,Periph vascular dis NOS,Periph vascular dis NOS,Old myocardial infarct,Old myocardial infarct,Old myocardial infarct,Hx of past noncompliance,Hx of past noncompliance,Hx of past noncompliance,Prsnl hst urnr dsrd calc,Prsnl hst urnr dsrd calc,Prsnl hst urnr dsrd calc,Hx-skin malignancy NEC,Hx-skin malignancy NEC,Hx-skin malignancy NEC : 6
## Gen cnv epil w/o intr ep,Food/vomit pneumonitis,Ventltr assoc pneumonia,Acute respiratry failure,Syncope and collapse,Adv eff cns muscl depres,Tox dif goiter no crisis,Cocaine depend-remiss,Hx of past noncompliance,No proc/patient decision,Tobacco use disorder : 6
## Influenza with pneumonia,Subendo infarct, initial,CHF NOS,Chr diastolic hrt fail,Dehydration,Chronic kidney dis NOS,Atrial fibrillation,Hyperlipidemia NEC/NOS,Hypertension NOS,DMII ophth nt st uncntrl,Diabetic retinopathy NOS,DMII renl nt st uncntrld,Nephritis NOS in oth dis,Aortocoronary bypass,Other lung disease NEC,Crnry athrscl natve vssl,Nontox multinodul goiter,Benign neoplasm adrenal : 6
## Oth lymp unsp xtrndl org,Oth lymp unsp xtrndl org,Oth lymp unsp xtrndl org,Oth lymp unsp xtrndl org,Ac kidny fail, tubr necr,Ac kidny fail, tubr necr,Ac kidny fail, tubr necr,Ac kidny fail, tubr necr,Myelopathy in oth dis,Myelopathy in oth dis,Myelopathy in oth dis,Myelopathy in oth dis,Pulm embol/infarct NEC,Pulm embol/infarct NEC,Pulm embol/infarct NEC,Pulm embol/infarct NEC,Cholelith w ac cholecyst,Cholelith w ac cholecyst,Cholelith w ac cholecyst,Cholelith w ac cholecyst,Cellulitis of trunk,Cellulitis of trunk,Cellulitis of trunk,Cellulitis of trunk,Pneumocystosis,Pneumocystosis,Pneumocystosis,Pneumocystosis,Pseudomonas septicemia,Pseudomonas septicemia,Pseudomonas septicemia,Pseudomonas septicemia,Sepsis,Sepsis,Sepsis,Sepsis,Pyelonephritis NOS,Pyelonephritis NOS,Pyelonephritis NOS,Pyelonephritis NOS,Encephalopathy NOS,Encephalopathy NOS,Encephalopathy NOS,Encephalopathy NOS,Nonpyogenic meningitis,Nonpyogenic meningitis,Nonpyogenic meningitis,Nonpyogenic meningitis,Disseminated candidiasis,Disseminated candidiasis,Disseminated candidiasis,Disseminated candidiasis,Blood in stool,Blood in stool,Blood in stool,Blood in stool,Hemorrhage complic proc,Hemorrhage complic proc,Hemorrhage complic proc,Hemorrhage complic proc,Crbl art ocl NOS w infrc,Crbl art ocl NOS w infrc,Crbl art ocl NOS w infrc,Crbl art ocl NOS w infrc,Hydronephrosis,Hydronephrosis,Hydronephrosis,Hydronephrosis,Hyposmolality,Hyposmolality,Hyposmolality,Hyposmolality,Delirium d/t other cond,Delirium d/t other cond,Delirium d/t other cond,Delirium d/t other cond,Acute kidney failure NEC,Acute kidney failure NEC,Acute kidney failure NEC,Acute kidney failure NEC,Neutropenia NOS,Neutropenia NOS,Neutropenia NOS,Neutropenia NOS,Stomatits & mucosits NEC,Stomatits & mucosits NEC,Stomatits & mucosits NEC,Stomatits & mucosits NEC,Fever in other diseases,Fever in other diseases,Fever in other diseases,Fever in other diseases,Adv eff antineoplastic,Adv eff antineoplastic,Adv eff antineoplastic,Adv eff 
antineoplastic,Oth/uns inf-cen ven cath,Oth/uns inf-cen ven cath,Oth/uns inf-cen ven cath,Oth/uns inf-cen ven cath,MRSA elsewhere/NOS,MRSA elsewhere/NOS,MRSA elsewhere/NOS,MRSA elsewhere/NOS,Abn react-procedure NEC,Abn react-procedure NEC,Abn react-procedure NEC,Abn react-procedure NEC,Lap surg convert to open,Lap surg convert to open,Lap surg convert to open,Lap surg convert to open,Shock w/o trauma NEC,Shock w/o trauma NEC,Shock w/o trauma NEC,Shock w/o trauma NEC,Abn reac-organ rem NEC,Abn reac-organ rem NEC,Abn reac-organ rem NEC,Abn reac-organ rem NEC,Adv eff anticoagulants,Adv eff anticoagulants,Adv eff anticoagulants,Adv eff anticoagulants,Stricture of ureter,Stricture of ureter,Stricture of ureter,Stricture of ureter,2nd deg burn back,2nd deg burn back,2nd deg burn back,2nd deg burn back,Abn react-radiotherapy,Abn react-radiotherapy,Abn react-radiotherapy,Abn react-radiotherapy,Pressure ulcer, buttock,Pressure ulcer, buttock,Pressure ulcer, buttock,Pressure ulcer, buttock,Pressure ulcer, stage II,Pressure ulcer, stage II,Pressure ulcer, stage II,Pressure ulcer, stage II,Hy kid NOS w cr kid I-IV,Hy kid NOS w cr kid I-IV,Hy kid NOS w cr kid I-IV,Hy kid NOS w cr kid I-IV,Hypercalcemia,Hypercalcemia,Hypercalcemia,Hypercalcemia: 6
## Subendo infarct, initial,Atrial fibrillation,Int inf clstrdium dfcile,Dehydration,Hyposmolality,Ath ext ntv art gngrene,Ulcer of heel & midfoot,Late effect acute polio,Pure hypercholesterolem,Hypertension NOS,Esophageal reflux,Hypothyroidism NOS,Old myocardial infarct,DMII wo cmp nt st uncntr : 6
## Thyrotox NOS w crisis,Hypopotassemia,Anemia-other chronic dis : 6
## (Other) :18096
## icd9_long_list
## Abdominal aneurysm without mention of rupture,Abdominal aneurysm without mention of rupture,Abdominal aneurysm without mention of rupture,Cardiac tamponade,Cardiac tamponade,Cardiac tamponade,Chronic systolic heart failure,Chronic systolic heart failure,Chronic systolic heart failure,Unspecified disease of pericardium,Unspecified disease of pericardium,Unspecified disease of pericardium,Unspecified acquired hypothyroidism,Unspecified acquired hypothyroidism,Unspecified acquired hypothyroidism,Other persistent mental disorders due to conditions classified elsewhere,Other persistent mental disorders due to conditions classified elsewhere,Other persistent mental disorders due to conditions classified elsewhere,Other iatrogenic hypotension,Other iatrogenic hypotension,Other iatrogenic hypotension,Other specified cardiac dysrhythmias,Other specified cardiac dysrhythmias,Other specified cardiac dysrhythmias,Other specified forms of chronic ischemic heart disease,Other specified forms of chronic ischemic heart disease,Other specified forms of chronic ischemic heart disease,Unspecified essential hypertension,Unspecified essential hypertension,Unspecified essential hypertension,Tobacco use disorder,Tobacco use disorder,Tobacco use disorder,Iron deficiency anemia, unspecified,Iron deficiency anemia, unspecified,Iron deficiency anemia, unspecified,Coronary atherosclerosis of native coronary artery,Coronary atherosclerosis of native coronary artery,Coronary atherosclerosis of native coronary artery,Congestive heart failure, unspecified,Congestive heart failure, unspecified,Congestive heart failure, unspecified,Other emphysema,Other emphysema,Other emphysema,Peripheral vascular disease, unspecified,Peripheral vascular disease, unspecified,Peripheral vascular disease, unspecified,Old myocardial infarction,Old myocardial infarction,Old myocardial infarction,Personal history of noncompliance with medical treatment, presenting hazards to health,Personal history of noncompliance 
with medical treatment, presenting hazards to health,Personal history of noncompliance with medical treatment, presenting hazards to health,Personal history of urinary calculi,Personal history of urinary calculi,Personal history of urinary calculi,Personal history of other malignant neoplasm of skin,Personal history of other malignant neoplasm of skin,Personal history of other malignant neoplasm of skin : 6
## Generalized convulsive epilepsy, without mention of intractable epilepsy,Pneumonitis due to inhalation of food or vomitus,Ventilator associated pneumonia,Acute respiratory failure,Syncope and collapse,Central nervous system muscle-tone depressants causing adverse effects in therapeutic use,Toxic diffuse goiter without mention of thyrotoxic crisis or storm,Cocaine dependence, in remission,Personal history of noncompliance with medical treatment, presenting hazards to health,Surgical or other procedure not carried out because of patient's decision,Tobacco use disorder : 6
## Influenza with pneumonia,Subendocardial infarction, initial episode of care,Congestive heart failure, unspecified,Chronic diastolic heart failure,Dehydration,Chronic kidney disease, unspecified,Atrial fibrillation,Other and unspecified hyperlipidemia,Unspecified essential hypertension,Diabetes with ophthalmic manifestations, type II or unspecified type, not stated as uncontrolled,Background diabetic retinopathy,Diabetes with renal manifestations, type II or unspecified type, not stated as uncontrolled,Nephritis and nephropathy, not specified as acute or chronic, in diseases classified elsewhere,Aortocoronary bypass status,Other diseases of lung, not elsewhere classified,Coronary atherosclerosis of native coronary artery,Nontoxic multinodular goiter,Benign neoplasm of adrenal gland : 6
## Other malignant lymphomas, unspecified site, extranodal and solid organ sites,Other malignant lymphomas, unspecified site, extranodal and solid organ sites,Other malignant lymphomas, unspecified site, extranodal and solid organ sites,Other malignant lymphomas, unspecified site, extranodal and solid organ sites,Acute kidney failure with lesion of tubular necrosis,Acute kidney failure with lesion of tubular necrosis,Acute kidney failure with lesion of tubular necrosis,Acute kidney failure with lesion of tubular necrosis,Myelopathy in other diseases classified elsewhere,Myelopathy in other diseases classified elsewhere,Myelopathy in other diseases classified elsewhere,Myelopathy in other diseases classified elsewhere,Other pulmonary embolism and infarction,Other pulmonary embolism and infarction,Other pulmonary embolism and infarction,Other pulmonary embolism and infarction,Calculus of gallbladder with acute cholecystitis, without mention of obstruction,Calculus of gallbladder with acute cholecystitis, without mention of obstruction,Calculus of gallbladder with acute cholecystitis, without mention of obstruction,Calculus of gallbladder with acute cholecystitis, without mention of obstruction,Cellulitis and abscess of trunk,Cellulitis and abscess of trunk,Cellulitis and abscess of trunk,Cellulitis and abscess of trunk,Pneumocystosis,Pneumocystosis,Pneumocystosis,Pneumocystosis,Septicemia due to pseudomonas,Septicemia due to pseudomonas,Septicemia due to pseudomonas,Septicemia due to pseudomonas,Sepsis,Sepsis,Sepsis,Sepsis,Pyelonephritis, unspecified,Pyelonephritis, unspecified,Pyelonephritis, unspecified,Pyelonephritis, unspecified,Encephalopathy, unspecified,Encephalopathy, unspecified,Encephalopathy, unspecified,Encephalopathy, unspecified,Nonpyogenic meningitis,Nonpyogenic meningitis,Nonpyogenic meningitis,Nonpyogenic meningitis,Disseminated candidiasis,Disseminated candidiasis,Disseminated candidiasis,Disseminated candidiasis,Blood in stool,Blood in stool,Blood 
in stool,Blood in stool,Hemorrhage complicating a procedure,Hemorrhage complicating a procedure,Hemorrhage complicating a procedure,Hemorrhage complicating a procedure,Cerebral artery occlusion, unspecified with cerebral infarction,Cerebral artery occlusion, unspecified with cerebral infarction,Cerebral artery occlusion, unspecified with cerebral infarction,Cerebral artery occlusion, unspecified with cerebral infarction,Hydronephrosis,Hydronephrosis,Hydronephrosis,Hydronephrosis,Hyposmolality and/or hyponatremia,Hyposmolality and/or hyponatremia,Hyposmolality and/or hyponatremia,Hyposmolality and/or hyponatremia,Delirium due to conditions classified elsewhere,Delirium due to conditions classified elsewhere,Delirium due to conditions classified elsewhere,Delirium due to conditions classified elsewhere,Acute kidney failure with other specified pathological lesion in kidney,Acute kidney failure with other specified pathological lesion in kidney,Acute kidney failure with other specified pathological lesion in kidney,Acute kidney failure with other specified pathological lesion in kidney,Neutropenia, unspecified,Neutropenia, unspecified,Neutropenia, unspecified,Neutropenia, unspecified,Other stomatitis and mucositis (ulcerative),Other stomatitis and mucositis (ulcerative),Other stomatitis and mucositis (ulcerative),Other stomatitis and mucositis (ulcerative),Fever presenting with conditions classified elsewhere,Fever presenting with conditions classified elsewhere,Fever presenting with conditions classified elsewhere,Fever presenting with conditions classified elsewhere,Antineoplastic and immunosuppressive drugs causing adverse effects in therapeutic use,Antineoplastic and immunosuppressive drugs causing adverse effects in therapeutic use,Antineoplastic and immunosuppressive drugs causing adverse effects in therapeutic use,Antineoplastic and immunosuppressive drugs causing adverse effects in therapeutic use,Other and unspecified infection due to central venous 
catheter,Other and unspecified infection due to central venous catheter,Other and unspecified infection due to central venous catheter,Other and unspecified infection due to central venous catheter,Methicillin resistant Staphylococcus aureus in conditions classified elsewhere and of unspecified site,Methicillin resistant Staphylococcus aureus in conditions classified elsewhere and of unspecified site,Methicillin resistant Staphylococcus aureus in conditions classified elsewhere and of unspecified site,Methicillin resistant Staphylococcus aureus in conditions classified elsewhere and of unspecified site,Other specified procedures as the cause of abnormal reaction of patient, or of later complication, without mention of misadventure at time of procedure,Other specified procedures as the cause of abnormal reaction of patient, or of later complication, without mention of misadventure at time of procedure,Other specified procedures as the cause of abnormal reaction of patient, or of later complication, without mention of misadventure at time of procedure,Other specified procedures as the cause of abnormal reaction of patient, or of later complication, without mention of misadventure at time of procedure,Laparoscopic surgical procedure converted to open procedure,Laparoscopic surgical procedure converted to open procedure,Laparoscopic surgical procedure converted to open procedure,Laparoscopic surgical procedure converted to open procedure,Other shock without mention of trauma,Other shock without mention of trauma,Other shock without mention of trauma,Other shock without mention of trauma,Removal of other organ (partial) (total) causing abnormal patient reaction, or later complication, without mention of misadventure at time of operation,Removal of other organ (partial) (total) causing abnormal patient reaction, or later complication, without mention of misadventure at time of operation,Removal of other organ (partial) (total) causing abnormal patient reaction, or later 
complication, without mention of misadventure at time of operation,Removal of other organ (partial) (total) causing abnormal patient reaction, or later complication, without mention of misadventure at time of operation,Anticoagulants causing adverse effects in therapeutic use,Anticoagulants causing adverse effects in therapeutic use,Anticoagulants causing adverse effects in therapeutic use,Anticoagulants causing adverse effects in therapeutic use,Stricture or kinking of ureter,Stricture or kinking of ureter,Stricture or kinking of ureter,Stricture or kinking of ureter,Blisters, epidermal loss [second degree] of back [any part],Blisters, epidermal loss [second degree] of back [any part],Blisters, epidermal loss [second degree] of back [any part],Blisters, epidermal loss [second degree] of back [any part],Radiological procedure and radiotherapy as the cause of abnormal reaction of patient, or of later complication, without mention of misadventure at time of procedure,Radiological procedure and radiotherapy as the cause of abnormal reaction of patient, or of later complication, without mention of misadventure at time of procedure,Radiological procedure and radiotherapy as the cause of abnormal reaction of patient, or of later complication, without mention of misadventure at time of procedure,Radiological procedure and radiotherapy as the cause of abnormal reaction of patient, or of later complication, without mention of misadventure at time of procedure,Pressure ulcer, buttock,Pressure ulcer, buttock,Pressure ulcer, buttock,Pressure ulcer, buttock,Pressure ulcer, stage II,Pressure ulcer, stage II,Pressure ulcer, stage II,Pressure ulcer, stage II,Hypertensive chronic kidney disease, unspecified, with chronic kidney disease stage I through stage IV, or unspecified,Hypertensive chronic kidney disease, unspecified, with chronic kidney disease stage I through stage IV, or unspecified,Hypertensive chronic kidney disease, unspecified, with chronic kidney disease stage I 
through stage IV, or unspecified,Hypertensive chronic kidney disease, unspecified, with chronic kidney disease stage I through stage IV, or unspecified,Hypercalcemia,Hypercalcemia,Hypercalcemia,Hypercalcemia: 6
## Other tracheostomy complications,Atrial fibrillation,Chronic respiratory failure,Dependence on respirator, status,Diastolic heart failure, unspecified,Congestive heart failure, unspecified,Pure hypercholesterolemia,Other diseases of trachea and bronchus,Personal history of malignant neoplasm of thyroid,Gastrostomy status,Unspecified acquired hypothyroidism,Other chronic pulmonary heart diseases,Anemia, unspecified : 6
## Other tracheostomy complications,Other tracheostomy complications,Congestive heart failure, unspecified,Congestive heart failure, unspecified,Chronic respiratory failure,Chronic respiratory failure,Chronic airway obstruction, not elsewhere classified,Chronic airway obstruction, not elsewhere classified,Other and unspecified angina pectoris,Other and unspecified angina pectoris,Urinary complications, not elsewhere classified,Urinary complications, not elsewhere classified,Urinary tract infection, site not specified,Urinary tract infection, site not specified,Infection and inflammatory reaction due to other vascular device, implant, and graft,Infection and inflammatory reaction due to other vascular device, implant, and graft,Bacteremia,Bacteremia,Methicillin susceptible pneumonia due to Staphylococcus aureus,Methicillin susceptible pneumonia due to Staphylococcus aureus,Hematoma complicating a procedure,Hematoma complicating a procedure,Other diseases of trachea and bronchus,Other diseases of trachea and bronchus,Nontoxic multinodular goiter,Nontoxic multinodular goiter,Other abnormal granulation tissue,Other abnormal granulation tissue,Morbid obesity,Morbid obesity,Diabetes mellitus without mention of complication, type II or unspecified type, not stated as uncontrolled,Diabetes mellitus without mention of complication, type II or unspecified type, not stated as uncontrolled,Coronary atherosclerosis of native coronary artery,Coronary atherosclerosis of native coronary artery,Friedl?nder's bacillus infection in conditions classified elsewhere and of unspecified site,Friedl?nder's bacillus infection in conditions classified elsewhere and of unspecified site,Proteus (mirabilis) (morganii) infection in conditions classified elsewhere and of unspecified site,Proteus (mirabilis) (morganii) infection in conditions classified elsewhere and of unspecified site,Methicillin susceptible Staphylococcus aureus in conditions classified elsewhere and of unspecified 
site,Methicillin susceptible Staphylococcus aureus in conditions classified elsewhere and of unspecified site,Infection with microorganisms resistant to penicillins,Infection with microorganisms resistant to penicillins,Ileostomy status,Ileostomy status : 6
## (Other) :18096
## drug_list
## NULL :13038
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 377
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 301
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 173
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium: 147
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 145
## (Other) : 3951
## drug_name
## NULL :13040
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 377
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 302
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 173
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium: 147
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 145
## (Other) : 3948
## generic_drug_name
## NULL :13040
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 377
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 301
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 173
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium: 146
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 145
## (Other) : 3950
## drug_type
## NULL :13038
## MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN : 378
## MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN : 305
## MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN : 177
## MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN: 151
## MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN : 146
## (Other) : 3937
## formulary
## NULL :13038
## LEVO25,LEVO25,LEVO25,LEVO25,LEVO25,LEVO25,LEVO25,LEVO25,LEVO25 : 82
## LEVO50,LEVO50,LEVO50,LEVO50,LEVO50,LEVO50,LEVO50,LEVO50,LEVO50 : 69
## LEVO75,LEVO75,LEVO75,LEVO75,LEVO75,LEVO75,LEVO75,LEVO75,LEVO75 : 60
## LEVO100,LEVO100,LEVO100,LEVO100,LEVO100,LEVO100,LEVO100,LEVO100,LEVO100: 48
## LEVO125,LEVO125,LEVO125,LEVO125,LEVO125,LEVO125,LEVO125,LEVO125,LEVO125: 45
## (Other) : 4790
## ndc
## NULL :13038
## 00074434113,00074434113,00074434113,00074434113,00074434113,00074434113,00074434113,00074434113,00074434113 : 68
## 00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211 : 53
## 00074518211,00074518211,00074518211,00074518211,00074518211,00074518211,00074518211,00074518211,00074518211 : 44
## 00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211: 37
## 00074662411,00074662411,00074662411,00074662411,00074662411,00074662411,00074662411,00074662411,00074662411 : 36
## (Other) : 4856
## gsn
## NULL :13038
## 006648,006648,006648,006648,006648,006648,006648,006648,006648: 82
## 006649,006649,006649,006649,006649,006649,006649,006649,006649: 69
## 006650,006650,006650,006650,006650,006650,006650,006650,006650: 60
## 006651,006651,006651,006651,006651,006651,006651,006651,006651: 48
## 006653,006653,006653,006653,006653,006653,006653,006653,006653: 45
## (Other) : 4790
Checking for NAs:
colSums(is.na(thyroid_df))
## subject_id hadm_id gender dob
## 0 0 0 0
## itemid value valuenum valueuom
## 0 0 0 0
## charttime label fluid icd9_list
## 0 0 0 0
## icd9_short_list icd9_long_list drug_list drug_name
## 0 0 0 0
## generic_drug_name drug_type formulary ndc
## 0 0 0 0
## gsn
## 0
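Every column reports zero NAs, yet the summary above shows many entries holding the literal string "NULL" (for example 13038 rows in drug_list). is.na() does not treat that string as missing. The following is a minimal sketch of how such placeholders could be counted alongside true NAs; it assumes a small made-up data frame (toy_df) rather than thyroid_df, and its values are purely illustrative.

```r
# Toy data frame standing in for thyroid_df (values are made up for illustration)
toy_df <- data.frame(
  valuenum  = c("1.1", "NULL", "1.3", "NULL"),
  drug_list = c("NULL", "Levothyroxine Sodium", "NULL", "NULL"),
  fluid     = c("Blood", "Blood", NA, "Blood"),
  stringsAsFactors = TRUE
)

# is.na() only sees real NA values, not the string "NULL"
na_counts <- colSums(is.na(toy_df))

# Count the "NULL" placeholders per column; as.character() makes the
# comparison work for both factor and character columns
null_counts <- sapply(
  toy_df,
  function(col) sum(as.character(col) == "NULL", na.rm = TRUE)
)

print(na_counts)
print(null_counts)
```

Running the same two counts on the real data frame would show which columns need their "NULL" strings recoded before any numeric conversion.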
Converting variables to the proper types:
# First, set "value" to NA in the rows where "valuenum" holds the string "NULL" (187 total)
null_rows <- which(thyroid_df$valuenum == "NULL")
thyroid_df$value[null_rows] <- NA
# Then convert the strings to numerics with destring(); coercing directly with as.numeric() displayed incorrect results (probably because "value" is read in as a factor, so as.numeric() returns level codes)
thyroid_df$value <- destring(thyroid_df$value)
thyroid_df$value <- as.numeric(thyroid_df$value)
thyroid_df$gender <- as.factor(thyroid_df$gender)
thyroid_df$label <- as.factor(thyroid_df$label)
thyroid_df$valueuom <- as.factor(thyroid_df$valueuom)
str(thyroid_df)
## 'data.frame': 18132 obs. of 21 variables:
## $ subject_id : int 3 3 4 13 19 22 34 34 34 38 ...
## $ hadm_id : int 145834 145834 185777 143045 109235 165315 115799 144319 144319 185910 ...
## $ gender : Factor w/ 2 levels "F","M": 2 2 1 1 2 1 2 2 2 2 ...
## $ dob : Factor w/ 9850 levels "1800-07-02 00:00:00",..: 892 892 9338 8593 71 8824 592 592 592 5572 ...
## $ itemid : int 50993 50994 50993 50993 50993 50993 50993 50993 50995 50993 ...
## $ value : num 3.4 6.6 0.35 1.2 1.5 2.5 2.7 8.4 1.1 1.5 ...
## $ valuenum : Factor w/ 492 levels "0.02","0.021",..: 338 413 91 160 163 294 296 454 159 163 ...
## $ valueuom : Factor w/ 10 levels "IU/mL","ng/dl",..: 10 8 9 9 9 10 9 9 3 9 ...
## $ charttime : Factor w/ 14120 levels "2100-07-05 04:12:00",..: 176 176 12650 9255 1022 13342 11992 12637 12638 9191 ...
## $ label : Factor w/ 6 levels "Thyroglobulin",..: 3 4 3 3 3 3 3 3 5 3 ...
## $ fluid : Factor w/ 1 level "Blood": 1 1 1 1 1 1 1 1 1 1 ...
## $ icd9_list : Factor w/ 12155 levels "0031,51881,5566,2535,72888,5845,78552,5070,2762,9982,2851,00845,5770,570,1890,34839,42832,9972,99592,03819,3030"| __truncated__,..: 1069 1069 1125 4025 10477 11251 3726 5343 5343 9853 ...
## $ icd9_short_list : Factor w/ 12155 levels "35-36 comp wks gestation,NB obsrv suspct infect,Need prphyl vc vrl hepat,Single lb in-hosp w cs,Preterm NEC 250"| __truncated__,..: 10609 10609 6303 4553 5752 9231 11322 3106 3106 7454 ...
## $ icd9_long_list : Factor w/ 12155 levels "35-36 completed weeks of gestation,Observation for suspected infectious condition,Need for prophylactic vaccina"| __truncated__,..: 11620 11620 5593 4446 3752 9513 10717 8429 8429 12082 ...
## $ drug_list : Factor w/ 171 levels "Levothyroxine Sodium,Levothyroxine Sodium",..: 171 171 8 171 171 171 7 7 7 171 ...
## $ drug_name : Factor w/ 170 levels "Levothyroxine Sodium,Levothyroxine Sodium",..: 170 170 8 170 170 170 7 7 7 170 ...
## $ generic_drug_name: Factor w/ 200 levels "*NF* Levothyroxine Sodium Brand Name,Levothyroxine Sodium,*NF* Levothyroxine Sodium Brand Name,Levothyroxine So"| __truncated__,..: 200 200 9 200 200 200 8 8 8 200 ...
## $ drug_type : Factor w/ 148 levels "MAIN,MAIN","MAIN,MAIN,MAIN",..: 148 148 8 148 148 148 7 7 7 148 ...
## $ formulary : Factor w/ 1427 levels "CYTO5T,LEVO88,CYTO5T,CYTO5T,CYTO5T,LEVO88,CYTO5T,CYTO5T,CYTO5T,LEVO88,CYTO5T,CYTO5T,CYTO5T,LEVO88,CYTO5T,CYTO5T"| __truncated__,..: 1427 1427 908 1427 1427 1427 9 1284 1284 1427 ...
## $ ndc : Factor w/ 1519 levels "00007521006,00007521006,00007521006,00007521006,00007521006,55390088010,00007521006,55390088010,55390088010,553"| __truncated__,..: 1519 1519 109 1519 1519 1519 755 490 490 1519 ...
## $ gsn : Factor w/ 1426 levels "006645,006645,006645,006645,006645,006645",..: 1426 1426 322 1426 1426 1426 898 709 709 1426 ...
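The conversion step avoids coercing the "value" factor straight to numeric. A minimal sketch of why, using a hypothetical toy factor rather than the real column; the as.numeric(as.character()) route shown here is a base R alternative to destring(), not the method used in this analysis:

```r
# Hypothetical toy factor standing in for the "value" column
value_factor <- factor(c("3.4", "6.6", "0.35", "1.2"))

# Direct coercion returns the internal level codes, not the recorded values
codes <- as.numeric(value_factor)

# Converting to character first recovers the actual numbers
values <- as.numeric(as.character(value_factor))

print(codes)   # positions of each entry among the sorted levels
print(values)  # the numbers as originally recorded
```

Unlike destring(), this route does not strip non-numeric characters, so an entry such as "<0.02" would become NA with a warning.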
summary(thyroid_df)
## subject_id hadm_id gender dob
## Min. : 3 Min. :100001 F:9450 2078-12-05 00:00:00: 40
## 1st Qu.:12076 1st Qu.:124705 M:8682 2052-02-14 00:00:00: 28
## Median :24633 Median :149789 2130-07-08 00:00:00: 21
## Mean :34134 Mean :149679 2066-10-13 00:00:00: 18
## 3rd Qu.:55085 3rd Qu.:174662 2071-06-27 00:00:00: 18
## Max. :99957 Max. :199993 2104-04-12 00:00:00: 17
## (Other) :17990
## itemid value valuenum valueuom
## Min. :50991 Min. : 0.020 1.1 : 782 uIU/mL :10803
## 1st Qu.:50993 1st Qu.: 1.100 1.2 : 688 ng/dL : 3747
## Median :50993 Median : 2.000 1.3 : 609 ug/dL : 1555
## Mean :50994 Mean : 8.703 1.4 : 535 uU/ML : 1259
## 3rd Qu.:50994 3rd Qu.: 4.900 1.5 : 502 ng/dl : 336
## Max. :51001 Max. :3890.000 1 : 428 uG/DL : 188
## NA's :187 (Other):14588 (Other): 244
## charttime label
## 2129-03-12 13:40:00: 6 Thyroglobulin : 47
## 2153-04-15 21:25:00: 5 Thyroid Peroxidase Antibodies: 65
## 2160-01-30 15:50:00: 5 Thyroid Stimulating Hormone :12063
## 2195-10-20 03:21:00: 5 Thyroxine (T4) : 1743
## 2104-05-22 20:31:00: 4 Thyroxine (T4), Free : 3031
## 2105-06-18 16:15:00: 4 Triiodothyronine (T3) : 1183
## (Other) :18103
## fluid
## Blood:18132
##
##
##
##
##
##
## icd9_list
## 20280,20280,20280,20280,5845,5845,5845,5845,3363,3363,3363,3363,41519,41519,41519,41519,57400,57400,57400,57400,6822,6822,6822,6822,1363,1363,1363,1363,03843,03843,03843,03843,99591,99591,99591,99591,59080,59080,59080,59080,34830,34830,34830,34830,3220,3220,3220,3220,1125,1125,1125,1125,2841,2841,2841,2841,5781,5781,5781,5781,99811,99811,99811,99811,43491,43491,43491,43491,591,591,591,591,2761,2761,2761,2761,2930,2930,2930,2930,5848,5848,5848,5848,28800,28800,28800,28800,52809,52809,52809,52809,78061,78061,78061,78061,E9331,E9331,E9331,E9331,99931,99931,99931,99931,04112,04112,04112,04112,E8798,E8798,E8798,E8798,V6441,V6441,V6441,V6441,78559,78559,78559,78559,E8786,E8786,E8786,E8786,E9342,E9342,E9342,E9342,5933,5933,5933,5933,94224,94224,94224,94224,E8792,E8792,E8792,E8792,70705,70705,70705,70705,70722,70722,70722,70722,40390,40390,40390,40390,27542,27542,27542,27542: 6
## 24291,2768,28529 : 6
## 34510,5070,99731,51881,7802,E9380,24200,30423,V1581,V642,3051 : 6
## 41071,42731,00845,27651,2761,44024,70714,138,2720,4019,53081,2449,412,25000 : 6
## 4414,4414,4414,4233,4233,4233,42822,42822,42822,4239,4239,4239,2449,2449,2449,2948,2948,2948,45829,45829,45829,42789,42789,42789,4148,4148,4148,4019,4019,4019,3051,3051,3051,2809,2809,2809,41401,41401,41401,4280,4280,4280,4928,4928,4928,4439,4439,4439,412,412,412,V1581,V1581,V1581,V1301,V1301,V1301,V1083,V1083,V1083 : 6
## 4870,41071,4280,42832,27651,5859,42731,2724,4019,25050,36201,25040,58381,V4581,51889,41401,2411,2270 : 6
## (Other) :18096
## icd9_short_list
## Abdom aortic aneurysm,Abdom aortic aneurysm,Abdom aortic aneurysm,Cardiac tamponade,Cardiac tamponade,Cardiac tamponade,Chr systolic hrt failure,Chr systolic hrt failure,Chr systolic hrt failure,Pericardial disease NOS,Pericardial disease NOS,Pericardial disease NOS,Hypothyroidism NOS,Hypothyroidism NOS,Hypothyroidism NOS,Mental disor NEC oth dis,Mental disor NEC oth dis,Mental disor NEC oth dis,Iatrogenc hypotnsion NEC,Iatrogenc hypotnsion NEC,Iatrogenc hypotnsion NEC,Cardiac dysrhythmias NEC,Cardiac dysrhythmias NEC,Cardiac dysrhythmias NEC,Chr ischemic hrt dis NEC,Chr ischemic hrt dis NEC,Chr ischemic hrt dis NEC,Hypertension NOS,Hypertension NOS,Hypertension NOS,Tobacco use disorder,Tobacco use disorder,Tobacco use disorder,Iron defic anemia NOS,Iron defic anemia NOS,Iron defic anemia NOS,Crnry athrscl natve vssl,Crnry athrscl natve vssl,Crnry athrscl natve vssl,CHF NOS,CHF NOS,CHF NOS,Emphysema NEC,Emphysema NEC,Emphysema NEC,Periph vascular dis NOS,Periph vascular dis NOS,Periph vascular dis NOS,Old myocardial infarct,Old myocardial infarct,Old myocardial infarct,Hx of past noncompliance,Hx of past noncompliance,Hx of past noncompliance,Prsnl hst urnr dsrd calc,Prsnl hst urnr dsrd calc,Prsnl hst urnr dsrd calc,Hx-skin malignancy NEC,Hx-skin malignancy NEC,Hx-skin malignancy NEC : 6
## Gen cnv epil w/o intr ep,Food/vomit pneumonitis,Ventltr assoc pneumonia,Acute respiratry failure,Syncope and collapse,Adv eff cns muscl depres,Tox dif goiter no crisis,Cocaine depend-remiss,Hx of past noncompliance,No proc/patient decision,Tobacco use disorder : 6
## Influenza with pneumonia,Subendo infarct, initial,CHF NOS,Chr diastolic hrt fail,Dehydration,Chronic kidney dis NOS,Atrial fibrillation,Hyperlipidemia NEC/NOS,Hypertension NOS,DMII ophth nt st uncntrl,Diabetic retinopathy NOS,DMII renl nt st uncntrld,Nephritis NOS in oth dis,Aortocoronary bypass,Other lung disease NEC,Crnry athrscl natve vssl,Nontox multinodul goiter,Benign neoplasm adrenal : 6
## Oth lymp unsp xtrndl org,Oth lymp unsp xtrndl org,Oth lymp unsp xtrndl org,Oth lymp unsp xtrndl org,Ac kidny fail, tubr necr,Ac kidny fail, tubr necr,Ac kidny fail, tubr necr,Ac kidny fail, tubr necr,Myelopathy in oth dis,Myelopathy in oth dis,Myelopathy in oth dis,Myelopathy in oth dis,Pulm embol/infarct NEC,Pulm embol/infarct NEC,Pulm embol/infarct NEC,Pulm embol/infarct NEC,Cholelith w ac cholecyst,Cholelith w ac cholecyst,Cholelith w ac cholecyst,Cholelith w ac cholecyst,Cellulitis of trunk,Cellulitis of trunk,Cellulitis of trunk,Cellulitis of trunk,Pneumocystosis,Pneumocystosis,Pneumocystosis,Pneumocystosis,Pseudomonas septicemia,Pseudomonas septicemia,Pseudomonas septicemia,Pseudomonas septicemia,Sepsis,Sepsis,Sepsis,Sepsis,Pyelonephritis NOS,Pyelonephritis NOS,Pyelonephritis NOS,Pyelonephritis NOS,Encephalopathy NOS,Encephalopathy NOS,Encephalopathy NOS,Encephalopathy NOS,Nonpyogenic meningitis,Nonpyogenic meningitis,Nonpyogenic meningitis,Nonpyogenic meningitis,Disseminated candidiasis,Disseminated candidiasis,Disseminated candidiasis,Disseminated candidiasis,Blood in stool,Blood in stool,Blood in stool,Blood in stool,Hemorrhage complic proc,Hemorrhage complic proc,Hemorrhage complic proc,Hemorrhage complic proc,Crbl art ocl NOS w infrc,Crbl art ocl NOS w infrc,Crbl art ocl NOS w infrc,Crbl art ocl NOS w infrc,Hydronephrosis,Hydronephrosis,Hydronephrosis,Hydronephrosis,Hyposmolality,Hyposmolality,Hyposmolality,Hyposmolality,Delirium d/t other cond,Delirium d/t other cond,Delirium d/t other cond,Delirium d/t other cond,Acute kidney failure NEC,Acute kidney failure NEC,Acute kidney failure NEC,Acute kidney failure NEC,Neutropenia NOS,Neutropenia NOS,Neutropenia NOS,Neutropenia NOS,Stomatits & mucosits NEC,Stomatits & mucosits NEC,Stomatits & mucosits NEC,Stomatits & mucosits NEC,Fever in other diseases,Fever in other diseases,Fever in other diseases,Fever in other diseases,Adv eff antineoplastic,Adv eff antineoplastic,Adv eff antineoplastic,Adv eff 
antineoplastic,Oth/uns inf-cen ven cath,Oth/uns inf-cen ven cath,Oth/uns inf-cen ven cath,Oth/uns inf-cen ven cath,MRSA elsewhere/NOS,MRSA elsewhere/NOS,MRSA elsewhere/NOS,MRSA elsewhere/NOS,Abn react-procedure NEC,Abn react-procedure NEC,Abn react-procedure NEC,Abn react-procedure NEC,Lap surg convert to open,Lap surg convert to open,Lap surg convert to open,Lap surg convert to open,Shock w/o trauma NEC,Shock w/o trauma NEC,Shock w/o trauma NEC,Shock w/o trauma NEC,Abn reac-organ rem NEC,Abn reac-organ rem NEC,Abn reac-organ rem NEC,Abn reac-organ rem NEC,Adv eff anticoagulants,Adv eff anticoagulants,Adv eff anticoagulants,Adv eff anticoagulants,Stricture of ureter,Stricture of ureter,Stricture of ureter,Stricture of ureter,2nd deg burn back,2nd deg burn back,2nd deg burn back,2nd deg burn back,Abn react-radiotherapy,Abn react-radiotherapy,Abn react-radiotherapy,Abn react-radiotherapy,Pressure ulcer, buttock,Pressure ulcer, buttock,Pressure ulcer, buttock,Pressure ulcer, buttock,Pressure ulcer, stage II,Pressure ulcer, stage II,Pressure ulcer, stage II,Pressure ulcer, stage II,Hy kid NOS w cr kid I-IV,Hy kid NOS w cr kid I-IV,Hy kid NOS w cr kid I-IV,Hy kid NOS w cr kid I-IV,Hypercalcemia,Hypercalcemia,Hypercalcemia,Hypercalcemia: 6
## Subendo infarct, initial,Atrial fibrillation,Int inf clstrdium dfcile,Dehydration,Hyposmolality,Ath ext ntv art gngrene,Ulcer of heel & midfoot,Late effect acute polio,Pure hypercholesterolem,Hypertension NOS,Esophageal reflux,Hypothyroidism NOS,Old myocardial infarct,DMII wo cmp nt st uncntr : 6
## Thyrotox NOS w crisis,Hypopotassemia,Anemia-other chronic dis : 6
## (Other) :18096
## icd9_long_list
## Abdominal aneurysm without mention of rupture,Abdominal aneurysm without mention of rupture,Abdominal aneurysm without mention of rupture,Cardiac tamponade,Cardiac tamponade,Cardiac tamponade,Chronic systolic heart failure,Chronic systolic heart failure,Chronic systolic heart failure,Unspecified disease of pericardium,Unspecified disease of pericardium,Unspecified disease of pericardium,Unspecified acquired hypothyroidism,Unspecified acquired hypothyroidism,Unspecified acquired hypothyroidism,Other persistent mental disorders due to conditions classified elsewhere,Other persistent mental disorders due to conditions classified elsewhere,Other persistent mental disorders due to conditions classified elsewhere,Other iatrogenic hypotension,Other iatrogenic hypotension,Other iatrogenic hypotension,Other specified cardiac dysrhythmias,Other specified cardiac dysrhythmias,Other specified cardiac dysrhythmias,Other specified forms of chronic ischemic heart disease,Other specified forms of chronic ischemic heart disease,Other specified forms of chronic ischemic heart disease,Unspecified essential hypertension,Unspecified essential hypertension,Unspecified essential hypertension,Tobacco use disorder,Tobacco use disorder,Tobacco use disorder,Iron deficiency anemia, unspecified,Iron deficiency anemia, unspecified,Iron deficiency anemia, unspecified,Coronary atherosclerosis of native coronary artery,Coronary atherosclerosis of native coronary artery,Coronary atherosclerosis of native coronary artery,Congestive heart failure, unspecified,Congestive heart failure, unspecified,Congestive heart failure, unspecified,Other emphysema,Other emphysema,Other emphysema,Peripheral vascular disease, unspecified,Peripheral vascular disease, unspecified,Peripheral vascular disease, unspecified,Old myocardial infarction,Old myocardial infarction,Old myocardial infarction,Personal history of noncompliance with medical treatment, presenting hazards to health,Personal history of noncompliance 
with medical treatment, presenting hazards to health,Personal history of noncompliance with medical treatment, presenting hazards to health,Personal history of urinary calculi,Personal history of urinary calculi,Personal history of urinary calculi,Personal history of other malignant neoplasm of skin,Personal history of other malignant neoplasm of skin,Personal history of other malignant neoplasm of skin : 6
## Generalized convulsive epilepsy, without mention of intractable epilepsy,Pneumonitis due to inhalation of food or vomitus,Ventilator associated pneumonia,Acute respiratory failure,Syncope and collapse,Central nervous system muscle-tone depressants causing adverse effects in therapeutic use,Toxic diffuse goiter without mention of thyrotoxic crisis or storm,Cocaine dependence, in remission,Personal history of noncompliance with medical treatment, presenting hazards to health,Surgical or other procedure not carried out because of patient's decision,Tobacco use disorder : 6
## Influenza with pneumonia,Subendocardial infarction, initial episode of care,Congestive heart failure, unspecified,Chronic diastolic heart failure,Dehydration,Chronic kidney disease, unspecified,Atrial fibrillation,Other and unspecified hyperlipidemia,Unspecified essential hypertension,Diabetes with ophthalmic manifestations, type II or unspecified type, not stated as uncontrolled,Background diabetic retinopathy,Diabetes with renal manifestations, type II or unspecified type, not stated as uncontrolled,Nephritis and nephropathy, not specified as acute or chronic, in diseases classified elsewhere,Aortocoronary bypass status,Other diseases of lung, not elsewhere classified,Coronary atherosclerosis of native coronary artery,Nontoxic multinodular goiter,Benign neoplasm of adrenal gland : 6
## Other malignant lymphomas, unspecified site, extranodal and solid organ sites,Other malignant lymphomas, unspecified site, extranodal and solid organ sites,Other malignant lymphomas, unspecified site, extranodal and solid organ sites,Other malignant lymphomas, unspecified site, extranodal and solid organ sites,Acute kidney failure with lesion of tubular necrosis,Acute kidney failure with lesion of tubular necrosis,Acute kidney failure with lesion of tubular necrosis,Acute kidney failure with lesion of tubular necrosis,Myelopathy in other diseases classified elsewhere,Myelopathy in other diseases classified elsewhere,Myelopathy in other diseases classified elsewhere,Myelopathy in other diseases classified elsewhere,Other pulmonary embolism and infarction,Other pulmonary embolism and infarction,Other pulmonary embolism and infarction,Other pulmonary embolism and infarction,Calculus of gallbladder with acute cholecystitis, without mention of obstruction,Calculus of gallbladder with acute cholecystitis, without mention of obstruction,Calculus of gallbladder with acute cholecystitis, without mention of obstruction,Calculus of gallbladder with acute cholecystitis, without mention of obstruction,Cellulitis and abscess of trunk,Cellulitis and abscess of trunk,Cellulitis and abscess of trunk,Cellulitis and abscess of trunk,Pneumocystosis,Pneumocystosis,Pneumocystosis,Pneumocystosis,Septicemia due to pseudomonas,Septicemia due to pseudomonas,Septicemia due to pseudomonas,Septicemia due to pseudomonas,Sepsis,Sepsis,Sepsis,Sepsis,Pyelonephritis, unspecified,Pyelonephritis, unspecified,Pyelonephritis, unspecified,Pyelonephritis, unspecified,Encephalopathy, unspecified,Encephalopathy, unspecified,Encephalopathy, unspecified,Encephalopathy, unspecified,Nonpyogenic meningitis,Nonpyogenic meningitis,Nonpyogenic meningitis,Nonpyogenic meningitis,Disseminated candidiasis,Disseminated candidiasis,Disseminated candidiasis,Disseminated candidiasis,Blood in stool,Blood in stool,Blood 
in stool,Blood in stool,Hemorrhage complicating a procedure,Hemorrhage complicating a procedure,Hemorrhage complicating a procedure,Hemorrhage complicating a procedure,Cerebral artery occlusion, unspecified with cerebral infarction,Cerebral artery occlusion, unspecified with cerebral infarction,Cerebral artery occlusion, unspecified with cerebral infarction,Cerebral artery occlusion, unspecified with cerebral infarction,Hydronephrosis,Hydronephrosis,Hydronephrosis,Hydronephrosis,Hyposmolality and/or hyponatremia,Hyposmolality and/or hyponatremia,Hyposmolality and/or hyponatremia,Hyposmolality and/or hyponatremia,Delirium due to conditions classified elsewhere,Delirium due to conditions classified elsewhere,Delirium due to conditions classified elsewhere,Delirium due to conditions classified elsewhere,Acute kidney failure with other specified pathological lesion in kidney,Acute kidney failure with other specified pathological lesion in kidney,Acute kidney failure with other specified pathological lesion in kidney,Acute kidney failure with other specified pathological lesion in kidney,Neutropenia, unspecified,Neutropenia, unspecified,Neutropenia, unspecified,Neutropenia, unspecified,Other stomatitis and mucositis (ulcerative),Other stomatitis and mucositis (ulcerative),Other stomatitis and mucositis (ulcerative),Other stomatitis and mucositis (ulcerative),Fever presenting with conditions classified elsewhere,Fever presenting with conditions classified elsewhere,Fever presenting with conditions classified elsewhere,Fever presenting with conditions classified elsewhere,Antineoplastic and immunosuppressive drugs causing adverse effects in therapeutic use,Antineoplastic and immunosuppressive drugs causing adverse effects in therapeutic use,Antineoplastic and immunosuppressive drugs causing adverse effects in therapeutic use,Antineoplastic and immunosuppressive drugs causing adverse effects in therapeutic use,Other and unspecified infection due to central venous 
catheter,Other and unspecified infection due to central venous catheter,Other and unspecified infection due to central venous catheter,Other and unspecified infection due to central venous catheter,Methicillin resistant Staphylococcus aureus in conditions classified elsewhere and of unspecified site,Methicillin resistant Staphylococcus aureus in conditions classified elsewhere and of unspecified site,Methicillin resistant Staphylococcus aureus in conditions classified elsewhere and of unspecified site,Methicillin resistant Staphylococcus aureus in conditions classified elsewhere and of unspecified site,Other specified procedures as the cause of abnormal reaction of patient, or of later complication, without mention of misadventure at time of procedure,Other specified procedures as the cause of abnormal reaction of patient, or of later complication, without mention of misadventure at time of procedure,Other specified procedures as the cause of abnormal reaction of patient, or of later complication, without mention of misadventure at time of procedure,Other specified procedures as the cause of abnormal reaction of patient, or of later complication, without mention of misadventure at time of procedure,Laparoscopic surgical procedure converted to open procedure,Laparoscopic surgical procedure converted to open procedure,Laparoscopic surgical procedure converted to open procedure,Laparoscopic surgical procedure converted to open procedure,Other shock without mention of trauma,Other shock without mention of trauma,Other shock without mention of trauma,Other shock without mention of trauma,Removal of other organ (partial) (total) causing abnormal patient reaction, or later complication, without mention of misadventure at time of operation,Removal of other organ (partial) (total) causing abnormal patient reaction, or later complication, without mention of misadventure at time of operation,Removal of other organ (partial) (total) causing abnormal patient reaction, or later 
complication, without mention of misadventure at time of operation,Removal of other organ (partial) (total) causing abnormal patient reaction, or later complication, without mention of misadventure at time of operation,Anticoagulants causing adverse effects in therapeutic use,Anticoagulants causing adverse effects in therapeutic use,Anticoagulants causing adverse effects in therapeutic use,Anticoagulants causing adverse effects in therapeutic use,Stricture or kinking of ureter,Stricture or kinking of ureter,Stricture or kinking of ureter,Stricture or kinking of ureter,Blisters, epidermal loss [second degree] of back [any part],Blisters, epidermal loss [second degree] of back [any part],Blisters, epidermal loss [second degree] of back [any part],Blisters, epidermal loss [second degree] of back [any part],Radiological procedure and radiotherapy as the cause of abnormal reaction of patient, or of later complication, without mention of misadventure at time of procedure,Radiological procedure and radiotherapy as the cause of abnormal reaction of patient, or of later complication, without mention of misadventure at time of procedure,Radiological procedure and radiotherapy as the cause of abnormal reaction of patient, or of later complication, without mention of misadventure at time of procedure,Radiological procedure and radiotherapy as the cause of abnormal reaction of patient, or of later complication, without mention of misadventure at time of procedure,Pressure ulcer, buttock,Pressure ulcer, buttock,Pressure ulcer, buttock,Pressure ulcer, buttock,Pressure ulcer, stage II,Pressure ulcer, stage II,Pressure ulcer, stage II,Pressure ulcer, stage II,Hypertensive chronic kidney disease, unspecified, with chronic kidney disease stage I through stage IV, or unspecified,Hypertensive chronic kidney disease, unspecified, with chronic kidney disease stage I through stage IV, or unspecified,Hypertensive chronic kidney disease, unspecified, with chronic kidney disease stage I 
through stage IV, or unspecified,Hypertensive chronic kidney disease, unspecified, with chronic kidney disease stage I through stage IV, or unspecified,Hypercalcemia,Hypercalcemia,Hypercalcemia,Hypercalcemia: 6
## Other tracheostomy complications,Atrial fibrillation,Chronic respiratory failure,Dependence on respirator, status,Diastolic heart failure, unspecified,Congestive heart failure, unspecified,Pure hypercholesterolemia,Other diseases of trachea and bronchus,Personal history of malignant neoplasm of thyroid,Gastrostomy status,Unspecified acquired hypothyroidism,Other chronic pulmonary heart diseases,Anemia, unspecified : 6
## Other tracheostomy complications,Other tracheostomy complications,Congestive heart failure, unspecified,Congestive heart failure, unspecified,Chronic respiratory failure,Chronic respiratory failure,Chronic airway obstruction, not elsewhere classified,Chronic airway obstruction, not elsewhere classified,Other and unspecified angina pectoris,Other and unspecified angina pectoris,Urinary complications, not elsewhere classified,Urinary complications, not elsewhere classified,Urinary tract infection, site not specified,Urinary tract infection, site not specified,Infection and inflammatory reaction due to other vascular device, implant, and graft,Infection and inflammatory reaction due to other vascular device, implant, and graft,Bacteremia,Bacteremia,Methicillin susceptible pneumonia due to Staphylococcus aureus,Methicillin susceptible pneumonia due to Staphylococcus aureus,Hematoma complicating a procedure,Hematoma complicating a procedure,Other diseases of trachea and bronchus,Other diseases of trachea and bronchus,Nontoxic multinodular goiter,Nontoxic multinodular goiter,Other abnormal granulation tissue,Other abnormal granulation tissue,Morbid obesity,Morbid obesity,Diabetes mellitus without mention of complication, type II or unspecified type, not stated as uncontrolled,Diabetes mellitus without mention of complication, type II or unspecified type, not stated as uncontrolled,Coronary atherosclerosis of native coronary artery,Coronary atherosclerosis of native coronary artery,Friedl?nder's bacillus infection in conditions classified elsewhere and of unspecified site,Friedl?nder's bacillus infection in conditions classified elsewhere and of unspecified site,Proteus (mirabilis) (morganii) infection in conditions classified elsewhere and of unspecified site,Proteus (mirabilis) (morganii) infection in conditions classified elsewhere and of unspecified site,Methicillin susceptible Staphylococcus aureus in conditions classified elsewhere and of unspecified 
site,Methicillin susceptible Staphylococcus aureus in conditions classified elsewhere and of unspecified site,Infection with microorganisms resistant to penicillins,Infection with microorganisms resistant to penicillins,Ileostomy status,Ileostomy status : 6
## (Other) :18096
## drug_list
## NULL :13038
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 377
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 301
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 173
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium: 147
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 145
## (Other) : 3951
## drug_name
## NULL :13040
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 377
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 302
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 173
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium: 147
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 145
## (Other) : 3948
## generic_drug_name
## NULL :13040
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 377
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 301
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 173
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium: 146
## Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium : 145
## (Other) : 3950
## drug_type
## NULL :13038
## MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN : 378
## MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN : 305
## MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN : 177
## MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN: 151
## MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN : 146
## (Other) : 3937
## formulary
## NULL :13038
## LEVO25,LEVO25,LEVO25,LEVO25,LEVO25,LEVO25,LEVO25,LEVO25,LEVO25 : 82
## LEVO50,LEVO50,LEVO50,LEVO50,LEVO50,LEVO50,LEVO50,LEVO50,LEVO50 : 69
## LEVO75,LEVO75,LEVO75,LEVO75,LEVO75,LEVO75,LEVO75,LEVO75,LEVO75 : 60
## LEVO100,LEVO100,LEVO100,LEVO100,LEVO100,LEVO100,LEVO100,LEVO100,LEVO100: 48
## LEVO125,LEVO125,LEVO125,LEVO125,LEVO125,LEVO125,LEVO125,LEVO125,LEVO125: 45
## (Other) : 4790
## ndc
## NULL :13038
## 00074434113,00074434113,00074434113,00074434113,00074434113,00074434113,00074434113,00074434113,00074434113 : 68
## 00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211 : 53
## 00074518211,00074518211,00074518211,00074518211,00074518211,00074518211,00074518211,00074518211,00074518211 : 44
## 00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211,00074455211: 37
## 00074662411,00074662411,00074662411,00074662411,00074662411,00074662411,00074662411,00074662411,00074662411 : 36
## (Other) : 4856
## gsn
## NULL :13038
## 006648,006648,006648,006648,006648,006648,006648,006648,006648: 82
## 006649,006649,006649,006649,006649,006649,006649,006649,006649: 69
## 006650,006650,006650,006650,006650,006650,006650,006650,006650: 60
## 006651,006651,006651,006651,006651,006651,006651,006651,006651: 48
## 006653,006653,006653,006653,006653,006653,006653,006653,006653: 45
## (Other) : 4790
Getting age of patients:
for (i in 1:nrow(thyroid_df)){
start_date = as.Date(thyroid_df$dob[i])
end_date = as.Date(thyroid_df$charttime[i])
thyroid_df$age[i] <- floor(lubridate::time_length(difftime(end_date, start_date), "years"))
}
# In MIMIC-III, patients older than 89 have shifted dates of birth, so their computed ages are impossibly large; cap them at 89
thyroid_df$age <- ifelse(thyroid_df$age > 89, 89, thyroid_df$age)
# Ages that floor to 0 (patients under one year old) are stored as 1
thyroid_df$age <- ifelse(thyroid_df$age == 0, 1, thyroid_df$age)
head(thyroid_df,5)
## subject_id hadm_id gender dob itemid value valuenum valueuom
## 1 3 145834 M 2025-04-11 00:00:00 50993 3.40 3.4 uU/ML
## 2 3 145834 M 2025-04-11 00:00:00 50994 6.60 6.6 uG/DL
## 3 4 185777 F 2143-05-12 00:00:00 50993 0.35 0.35 uIU/mL
## 4 13 143045 F 2127-02-27 00:00:00 50993 1.20 1.2 uIU/mL
## 5 19 109235 M 1808-08-05 00:00:00 50993 1.50 1.5 uIU/mL
## charttime label fluid
## 1 2101-10-21 13:00:00 Thyroid Stimulating Hormone Blood
## 2 2101-10-21 13:00:00 Thyroxine (T4) Blood
## 3 2191-03-16 05:42:00 Thyroid Stimulating Hormone Blood
## 4 2167-01-09 07:11:00 Thyroid Stimulating Hormone Blood
## 5 2108-08-05 15:00:00 Thyroid Stimulating Hormone Blood
## icd9_list
## 1 0389,78559,5849,4275,41071,4280,6826,4254,2639
## 2 0389,78559,5849,4275,41071,4280,6826,4254,2639
## 3 042,1363,7994,2763,7907,5715,04111,V090,E9317
## 4 41401,4111,25000,4019,2720
## 5 80502,5990,5964,E8809,8220,73300,2948,4019,44321
## icd9_short_list
## 1 Septicemia NOS,Shock w/o trauma NEC,Acute kidney failure NOS,Cardiac arrest,Subendo infarct, initial,CHF NOS,Cellulitis of leg,Prim cardiomyopathy NEC,Protein-cal malnutr NOS
## 2 Septicemia NOS,Shock w/o trauma NEC,Acute kidney failure NOS,Cardiac arrest,Subendo infarct, initial,CHF NOS,Cellulitis of leg,Prim cardiomyopathy NEC,Protein-cal malnutr NOS
## 3 Human immuno virus dis,Pneumocystosis,Cachexia,Alkalosis,Bacteremia,Cirrhosis of liver NOS,Mth sus Stph aur els/NOS,Inf mcrg rstn pncllins,Adv eff antiviral drugs
## 4 Crnry athrscl natve vssl,Intermed coronary synd,DMII wo cmp nt st uncntr,Hypertension NOS,Pure hypercholesterolem
## 5 Fx c2 vertebra-closed,Urin tract infection NOS,Atony of bladder,Fall on stair/step NEC,Fracture patella-closed,Osteoporosis NOS,Mental disor NEC oth dis,Hypertension NOS,Dissect carotid artery
## icd9_long_list
## 1 Unspecified septicemia,Other shock without mention of trauma,Acute kidney failure, unspecified,Cardiac arrest,Subendocardial infarction, initial episode of care,Congestive heart failure, unspecified,Cellulitis and abscess of leg, except foot,Other primary cardiomyopathies,Unspecified protein-calorie malnutrition
## 2 Unspecified septicemia,Other shock without mention of trauma,Acute kidney failure, unspecified,Cardiac arrest,Subendocardial infarction, initial episode of care,Congestive heart failure, unspecified,Cellulitis and abscess of leg, except foot,Other primary cardiomyopathies,Unspecified protein-calorie malnutrition
## 3 Human immunodeficiency virus [HIV] disease,Pneumocystosis,Cachexia,Alkalosis,Bacteremia,Cirrhosis of liver without mention of alcohol,Methicillin susceptible Staphylococcus aureus in conditions classified elsewhere and of unspecified site,Infection with microorganisms resistant to penicillins,Antiviral drugs causing adverse effects in therapeutic use
## 4 Coronary atherosclerosis of native coronary artery,Intermediate coronary syndrome,Diabetes mellitus without mention of complication, type II or unspecified type, not stated as uncontrolled,Unspecified essential hypertension,Pure hypercholesterolemia
## 5 Closed fracture of second cervical vertebra,Urinary tract infection, site not specified,Atony of bladder,Accidental fall on or from other stairs or steps,Closed fracture of patella,Osteoporosis, unspecified,Other persistent mental disorders due to conditions classified elsewhere,Unspecified essential hypertension,Dissection of carotid artery
## drug_list
## 1 NULL
## 2 NULL
## 3 Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium
## 4 NULL
## 5 NULL
## drug_name
## 1 NULL
## 2 NULL
## 3 Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium
## 4 NULL
## 5 NULL
## generic_drug_name
## 1 NULL
## 2 NULL
## 3 Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium
## 4 NULL
## 5 NULL
## drug_type
## 1 NULL
## 2 NULL
## 3 MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN
## 4 NULL
## 5 NULL
## formulary
## 1 NULL
## 2 NULL
## 3 LEVO25,LEVO25,LEVO25,LEVO25,LEVO25,LEVO25,LEVO25,LEVO25,LEVO25
## 4 NULL
## 5 NULL
## ndc
## 1 NULL
## 2 NULL
## 3 00074434113,00074434113,00074434113,00074434113,00074434113,00074434113,00074434113,00074434113,00074434113
## 4 NULL
## 5 NULL
## gsn age
## 1 NULL 76
## 2 NULL 76
## 3 006648,006648,006648,006648,006648,006648,006648,006648,006648 47
## 4 NULL 39
## 5 NULL 89
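The row-by-row loop above can also be written as a vectorized computation. The sketch below is not the code that produced the output above: it uses base R only, a hypothetical `sample_df` built from four of the dob/charttime pairs shown, and approximates a year as 365.25 days (an assumption), so it may differ from the lubridate-based result very close to a birthday.

```r
# Hypothetical mini-sample copied from four rows of the head() output above
sample_df <- data.frame(
  dob = c("2025-04-11 00:00:00", "2143-05-12 00:00:00",
          "2127-02-27 00:00:00", "1808-08-05 00:00:00"),
  charttime = c("2101-10-21 13:00:00", "2191-03-16 05:42:00",
                "2167-01-09 07:11:00", "2108-08-05 15:00:00"),
  stringsAsFactors = FALSE
)

# Difference between two Date vectors is a difftime in days
days <- as.numeric(as.Date(sample_df$charttime) - as.Date(sample_df$dob))

# Assumption: one year is approximated as 365.25 days
age <- floor(days / 365.25)

# Same post-processing as above: cap at 89, store 0 as 1
age <- pmin(age, 89)
age[age == 0] <- 1

sample_df$age <- age
print(sample_df$age)
```

Working on whole columns avoids growing `thyroid_df$age` one element at a time, which can matter with 18132 rows.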
Checking whether the patient has been prescribed T4 or T3 medication:
thyroid_df$t4_medication <- ifelse(str_detect(thyroid_df$drug_list, 'Levothyroxine') | str_detect(thyroid_df$drug_list, 'Synthroid'),"Yes","No") # T4
thyroid_df$t3_medication <- ifelse(str_detect(thyroid_df$drug_list, 'Liothyronine'),"Yes","No") # T3
head(thyroid_df,5)
## subject_id hadm_id gender dob itemid value valuenum valueuom
## 1 3 145834 M 2025-04-11 00:00:00 50993 3.40 3.4 uU/ML
## 2 3 145834 M 2025-04-11 00:00:00 50994 6.60 6.6 uG/DL
## 3 4 185777 F 2143-05-12 00:00:00 50993 0.35 0.35 uIU/mL
## 4 13 143045 F 2127-02-27 00:00:00 50993 1.20 1.2 uIU/mL
## 5 19 109235 M 1808-08-05 00:00:00 50993 1.50 1.5 uIU/mL
## charttime label fluid
## 1 2101-10-21 13:00:00 Thyroid Stimulating Hormone Blood
## 2 2101-10-21 13:00:00 Thyroxine (T4) Blood
## 3 2191-03-16 05:42:00 Thyroid Stimulating Hormone Blood
## 4 2167-01-09 07:11:00 Thyroid Stimulating Hormone Blood
## 5 2108-08-05 15:00:00 Thyroid Stimulating Hormone Blood
## icd9_list
## 1 0389,78559,5849,4275,41071,4280,6826,4254,2639
## 2 0389,78559,5849,4275,41071,4280,6826,4254,2639
## 3 042,1363,7994,2763,7907,5715,04111,V090,E9317
## 4 41401,4111,25000,4019,2720
## 5 80502,5990,5964,E8809,8220,73300,2948,4019,44321
## icd9_short_list
## 1 Septicemia NOS,Shock w/o trauma NEC,Acute kidney failure NOS,Cardiac arrest,Subendo infarct, initial,CHF NOS,Cellulitis of leg,Prim cardiomyopathy NEC,Protein-cal malnutr NOS
## 2 Septicemia NOS,Shock w/o trauma NEC,Acute kidney failure NOS,Cardiac arrest,Subendo infarct, initial,CHF NOS,Cellulitis of leg,Prim cardiomyopathy NEC,Protein-cal malnutr NOS
## 3 Human immuno virus dis,Pneumocystosis,Cachexia,Alkalosis,Bacteremia,Cirrhosis of liver NOS,Mth sus Stph aur els/NOS,Inf mcrg rstn pncllins,Adv eff antiviral drugs
## 4 Crnry athrscl natve vssl,Intermed coronary synd,DMII wo cmp nt st uncntr,Hypertension NOS,Pure hypercholesterolem
## 5 Fx c2 vertebra-closed,Urin tract infection NOS,Atony of bladder,Fall on stair/step NEC,Fracture patella-closed,Osteoporosis NOS,Mental disor NEC oth dis,Hypertension NOS,Dissect carotid artery
## icd9_long_list
## 1 Unspecified septicemia,Other shock without mention of trauma,Acute kidney failure, unspecified,Cardiac arrest,Subendocardial infarction, initial episode of care,Congestive heart failure, unspecified,Cellulitis and abscess of leg, except foot,Other primary cardiomyopathies,Unspecified protein-calorie malnutrition
## 2 Unspecified septicemia,Other shock without mention of trauma,Acute kidney failure, unspecified,Cardiac arrest,Subendocardial infarction, initial episode of care,Congestive heart failure, unspecified,Cellulitis and abscess of leg, except foot,Other primary cardiomyopathies,Unspecified protein-calorie malnutrition
## 3 Human immunodeficiency virus [HIV] disease,Pneumocystosis,Cachexia,Alkalosis,Bacteremia,Cirrhosis of liver without mention of alcohol,Methicillin susceptible Staphylococcus aureus in conditions classified elsewhere and of unspecified site,Infection with microorganisms resistant to penicillins,Antiviral drugs causing adverse effects in therapeutic use
## 4 Coronary atherosclerosis of native coronary artery,Intermediate coronary syndrome,Diabetes mellitus without mention of complication, type II or unspecified type, not stated as uncontrolled,Unspecified essential hypertension,Pure hypercholesterolemia
## 5 Closed fracture of second cervical vertebra,Urinary tract infection, site not specified,Atony of bladder,Accidental fall on or from other stairs or steps,Closed fracture of patella,Osteoporosis, unspecified,Other persistent mental disorders due to conditions classified elsewhere,Unspecified essential hypertension,Dissection of carotid artery
## drug_list
## 1 NULL
## 2 NULL
## 3 Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium
## 4 NULL
## 5 NULL
## drug_name
## 1 NULL
## 2 NULL
## 3 Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium
## 4 NULL
## 5 NULL
## generic_drug_name
## 1 NULL
## 2 NULL
## 3 Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium,Levothyroxine Sodium
## 4 NULL
## 5 NULL
## drug_type
## 1 NULL
## 2 NULL
## 3 MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN,MAIN
## 4 NULL
## 5 NULL
## formulary
## 1 NULL
## 2 NULL
## 3 LEVO25,LEVO25,LEVO25,LEVO25,LEVO25,LEVO25,LEVO25,LEVO25,LEVO25
## 4 NULL
## 5 NULL
## ndc
## 1 NULL
## 2 NULL
## 3 00074434113,00074434113,00074434113,00074434113,00074434113,00074434113,00074434113,00074434113,00074434113
## 4 NULL
## 5 NULL
## gsn age
## 1 NULL 76
## 2 NULL 76
## 3 006648,006648,006648,006648,006648,006648,006648,006648,006648 47
## 4 NULL 39
## 5 NULL 89
## t4_medication t3_medication
## 1 No No
## 2 No No
## 3 Yes No
## 4 No No
## 5 No No
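The same flags can be sketched in base R with a single regular expression per medication class. The `drug_list` values below are a hypothetical sample (only the Levothyroxine entry and `NULL` appear in the output above), and the matching is case-sensitive, as in the original `str_detect` calls.

```r
# Hypothetical drug_list values; "NULL" is a literal string in this dataset
drug_list <- c(
  "NULL",
  "Levothyroxine Sodium,Levothyroxine Sodium",
  "Liothyronine Sodium",   # assumed example of a T3 drug entry
  "Synthroid"              # assumed example of a brand-name T4 entry
)

# One alternation pattern replaces the two str_detect() calls joined with |
t4_medication <- ifelse(grepl("Levothyroxine|Synthroid", drug_list), "Yes", "No")
t3_medication <- ifelse(grepl("Liothyronine", drug_list), "Yes", "No")

print(data.frame(drug_list, t4_medication, t3_medication))
```

Because `NULL` is stored as text rather than as a missing value, it simply fails to match and yields "No".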
Removing unwanted columns:
cleaned_thyroid <- subset(thyroid_df, select = -c(drug_list, generic_drug_name, drug_type, dob, subject_id,formulary, ndc, gsn, drug_name, fluid, itemid, icd9_list, icd9_short_list, icd9_long_list, charttime))
summary(cleaned_thyroid)
## hadm_id gender value valuenum valueuom
## Min. :100001 F:9450 Min. : 0.020 1.1 : 782 uIU/mL :10803
## 1st Qu.:124705 M:8682 1st Qu.: 1.100 1.2 : 688 ng/dL : 3747
## Median :149789 Median : 2.000 1.3 : 609 ug/dL : 1555
## Mean :149679 Mean : 8.703 1.4 : 535 uU/ML : 1259
## 3rd Qu.:174662 3rd Qu.: 4.900 1.5 : 502 ng/dl : 336
## Max. :199993 Max. :3890.000 1 : 428 uG/DL : 188
## NA's :187 (Other):14588 (Other): 244
## label age t4_medication
## Thyroglobulin : 47 Min. : 1.00 Length:18132
## Thyroid Peroxidase Antibodies: 65 1st Qu.:54.00 Class :character
## Thyroid Stimulating Hormone :12063 Median :67.00 Mode :character
## Thyroxine (T4) : 1743 Mean :64.48
## Thyroxine (T4), Free : 3031 3rd Qu.:79.00
## Triiodothyronine (T3) : 1183 Max. :89.00
##
## t3_medication
## Length:18132
## Class :character
## Mode :character
##
##
##
##
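An equivalent way to drop columns is to compute the names to keep with `setdiff()`. This is a minimal sketch on a hypothetical four-column `demo_df`, not on the full `thyroid_df`.

```r
# Hypothetical small data frame with two columns we want to drop
demo_df <- data.frame(
  hadm_id = c(145834L, 185777L),
  dob = c("2025-04-11 00:00:00", "2143-05-12 00:00:00"),
  fluid = c("Blood", "Blood"),
  value = c(3.4, 0.35),
  stringsAsFactors = FALSE
)

drop_cols <- c("dob", "fluid")

# Keep every column whose name is not in drop_cols
kept_df <- demo_df[, setdiff(names(demo_df), drop_cols), drop = FALSE]
print(names(kept_df))
```

Unlike `subset(select = -c(...))`, this form takes the column names as strings, so the list of columns to drop can be stored in a variable.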
Checking how many unique hadm_id values there are among the 18132 observations:
data.table::uniqueN(cleaned_thyroid[['hadm_id']])
## [1] 12161
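The same count can be obtained in base R, and `table()` shows how many rows each admission contributes. The `hadm_id` vector below is a hypothetical sample taken from the first rows shown earlier.

```r
# Hypothetical sample: admission 145834 appears twice (TSH and T4 rows)
hadm_id <- c(145834L, 145834L, 185777L, 143045L, 109235L)

# Base-R equivalent of data.table::uniqueN()
n_unique <- length(unique(hadm_id))

# Rows per admission; values above 1 indicate several lab results per admission
rows_per_admission <- table(hadm_id)

print(n_unique)
print(rows_per_admission)
```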
Creating 6 dataframes, one per label and each holding all entries for that label, so that the data can later be reshaped into records of unique hadm_id:
1. Dataframe containing only the rows whose “label” is Thyroglobulin:
cleaned_thyroid_tg <- cleaned_thyroid
cleaned_thyroid_tg <- cleaned_thyroid_tg[cleaned_thyroid_tg$label == "Thyroglobulin", ]
head(cleaned_thyroid_tg,5)
## hadm_id gender value valuenum valueuom label age t4_medication
## 200 150352 M 8 8 ng/mL Thyroglobulin 56 No
## 241 114791 M 4 4 ng/mL Thyroglobulin 84 No
## 705 186403 F 13 13 ng/mL Thyroglobulin 64 No
## 987 112512 F NA NULL ng/mL Thyroglobulin 69 No
## 2320 115026 M 27 27 ng/mL Thyroglobulin 70 Yes
## t3_medication
## 200 No
## 241 No
## 705 No
## 987 Yes
## 2320 No
2. Dataframe containing only the rows whose “label” is Thyroid Peroxidase Antibodies:
cleaned_thyroid_tpa <- cleaned_thyroid
cleaned_thyroid_tpa <- cleaned_thyroid_tpa[cleaned_thyroid_tpa$label == "Thyroid Peroxidase Antibodies", ]
head(cleaned_thyroid_tpa,5)
## hadm_id gender value valuenum valueuom label age
## 242 114791 M NA NULL IU/mL Thyroid Peroxidase Antibodies 84
## 696 157668 F NA NULL IU/mL Thyroid Peroxidase Antibodies 72
## 2409 100319 M 14 14 IU/mL Thyroid Peroxidase Antibodies 70
## 2718 142869 F NA NULL IU/mL Thyroid Peroxidase Antibodies 30
## 2816 122241 F NA NULL IU/mL Thyroid Peroxidase Antibodies 36
## t4_medication t3_medication
## 242 No No
## 696 No No
## 2409 No No
## 2718 Yes No
## 2816 No No
3. Dataframe containing only the rows whose “label” is Thyroid Stimulating Hormone:
cleaned_thyroid_tsh <- cleaned_thyroid
cleaned_thyroid_tsh <- cleaned_thyroid_tsh[cleaned_thyroid_tsh$label == "Thyroid Stimulating Hormone", ]
head(cleaned_thyroid_tsh,5)
## hadm_id gender value valuenum valueuom label age
## 1 145834 M 3.40 3.4 uU/ML Thyroid Stimulating Hormone 76
## 3 185777 F 0.35 0.35 uIU/mL Thyroid Stimulating Hormone 47
## 4 143045 F 1.20 1.2 uIU/mL Thyroid Stimulating Hormone 39
## 5 109235 M 1.50 1.5 uIU/mL Thyroid Stimulating Hormone 89
## 6 165315 F 2.50 2.5 uU/ML Thyroid Stimulating Hormone 64
## t4_medication t3_medication
## 1 No No
## 3 Yes No
## 4 No No
## 5 No No
## 6 No No
4. Dataframe containing only the rows whose “label” is Thyroxine (T4):
cleaned_thyroid_t4 <- cleaned_thyroid
cleaned_thyroid_t4 <- cleaned_thyroid_t4[cleaned_thyroid_t4$label == "Thyroxine (T4)", ]
head(cleaned_thyroid_t4,5)
## hadm_id gender value valuenum valueuom label age t4_medication
## 2 145834 M 6.6 6.6 uG/DL Thyroxine (T4) 76 No
## 16 189535 M 4.4 4.4 ug/dL Thyroxine (T4) 55 No
## 43 190707 M 5.2 5.2 uG/DL Thyroxine (T4) 85 No
## 51 156461 M 10.3 10.3 uG/DL Thyroxine (T4) 1 No
## 61 135671 F 7.2 7.2 uG/DL Thyroxine (T4) 75 No
## t3_medication
## 2 No
## 16 No
## 43 No
## 51 No
## 61 No
5. Dataframe containing only the rows whose “label” is Thyroxine (T4), Free:
cleaned_thyroid_t4_free <- cleaned_thyroid
cleaned_thyroid_t4_free <- cleaned_thyroid_t4_free[cleaned_thyroid_t4_free$label == "Thyroxine (T4), Free", ]
head(cleaned_thyroid_t4_free,5)
## hadm_id gender value valuenum valueuom label age
## 9 144319 M 1.1 1.1 ng/dL Thyroxine (T4), Free 89
## 13 181750 M 1.3 1.3 ng/dL Thyroxine (T4), Free 80
## 17 189535 M 1.0 1 ng/dL Thyroxine (T4), Free 55
## 26 161160 F 1.3 1.3 ng/dL Thyroxine (T4), Free 35
## 30 125288 F 1.2 1.2 ng/dL Thyroxine (T4), Free 24
## t4_medication t3_medication
## 9 Yes No
## 13 No No
## 17 No No
## 26 No No
## 30 No No
6. Dataframe containing only the rows whose “label” is Triiodothyronine (T3):
cleaned_thyroid_t3 <- cleaned_thyroid
cleaned_thyroid_t3 <- cleaned_thyroid_t3[cleaned_thyroid_t3$label == "Triiodothyronine (T3)", ]
head(cleaned_thyroid_t3,5)
## hadm_id gender value valuenum valueuom label age
## 27 161160 F 49 49 ng/dL Triiodothyronine (T3) 35
## 62 135671 F 74 74 NG/DL Triiodothyronine (T3) 75
## 66 101148 F 58 58 ng/dL Triiodothyronine (T3) 85
## 71 197273 M 53 53 ng/dL Triiodothyronine (T3) 63
## 75 137006 F 30 30 ng/dL Triiodothyronine (T3) 68
## t4_medication t3_medication
## 27 No No
## 62 No No
## 66 Yes No
## 71 Yes No
## 75 No No
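The six per-label dataframes can also be produced in one call with `split()`. This is a sketch on a hypothetical four-row `cleaned_demo`, not the code used above; the values are copied from rows shown earlier.

```r
# Hypothetical miniature of cleaned_thyroid with three of the six labels
cleaned_demo <- data.frame(
  hadm_id = c(145834L, 145834L, 185777L, 161160L),
  label = factor(c("Thyroid Stimulating Hormone", "Thyroxine (T4)",
                   "Thyroid Stimulating Hormone", "Triiodothyronine (T3)")),
  value = c(3.4, 6.6, 0.35, 49)
)

# split() returns a named list with one data frame per factor level
by_label <- split(cleaned_demo, cleaned_demo$label)

# Individual pieces are then available by label name
tsh_demo <- by_label[["Thyroid Stimulating Hormone"]]
print(names(by_label))
print(tsh_demo)
```

Keeping the pieces in a named list avoids repeating the filter six times and makes it straightforward to loop over labels later.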
Merging dataframes to get the data in the desired format
1. Merging the Thyroglobulin dataframe with the Thyroid Peroxidase Antibodies dataframe:
merged_df <- merge(cleaned_thyroid_tg, cleaned_thyroid_tpa, by.x = "hadm_id", by.y = "hadm_id", all.x = TRUE, all.y = TRUE)
merged_df <- rename(merged_df, c("gender.x"="gender_tg", "value.x"="tg_value_inNumeric", "valuenum.x"="tg_value", "valueuom.x"="tg_value_unit", "age.x"="age_tg", "label.x"="label_tg", "t4_medication.x"="t4_medication_tg", "t3_medication.x"="t3_medication_tg","gender.y"="gender_tpa", "value.y"="tpa_value_inNumeric", "valuenum.y"="tpa_value", "valueuom.y"="tpa_value_unit", "label.y"="label_tpa", "age.y"="age_tpa", "t4_medication.y"="t4_medication_tpa", "t3_medication.y"="t3_medication_tpa"))
merged_df$age <- ifelse(!is.na(merged_df$age_tg) & is.na(merged_df$age_tpa), merged_df$age_tg, merged_df$age_tpa)
merged_df$gender <- ifelse(!is.na(merged_df$gender_tg) & is.na(merged_df$gender_tpa), merged_df$gender_tg, merged_df$gender_tpa)
merged_df$t3_medication <- ifelse(!is.na(merged_df$t3_medication_tg) & is.na(merged_df$t3_medication_tpa), merged_df$t3_medication_tg, merged_df$t3_medication_tpa)
merged_df$t4_medication <- ifelse(!is.na(merged_df$t4_medication_tg) & is.na(merged_df$t4_medication_tpa), merged_df$t4_medication_tg, merged_df$t4_medication_tpa)
merged_df <- subset(merged_df, select = -c(gender_tg, gender_tpa, age_tg, age_tpa, tg_value, tpa_value, t3_medication_tpa, t3_medication_tg, t4_medication_tg, t4_medication_tpa))
head(merged_df,5)
## hadm_id tg_value_inNumeric tg_value_unit label_tg tpa_value_inNumeric
## 1 100253 NA <NA> <NA> 289
## 2 100319 NA <NA> <NA> 14
## 3 100398 1530 ng/mL Thyroglobulin NA
## 4 101015 66 ng/mL Thyroglobulin NA
## 5 101432 1990 ng/mL Thyroglobulin NA
## tpa_value_unit label_tpa age gender t3_medication
## 1 IU/mL Thyroid Peroxidase Antibodies 26 1 No
## 2 IU/mL Thyroid Peroxidase Antibodies 70 2 No
## 3 <NA> <NA> 75 1 No
## 4 <NA> <NA> 1 1 No
## 5 <NA> <NA> 64 2 No
## t4_medication
## 1 No
## 2 No
## 3 Yes
## 4 Yes
## 5 No
str(merged_df)
## 'data.frame': 101 obs. of 11 variables:
## $ hadm_id : int 100253 100319 100398 101015 101432 103103 103198 104841 105316 109134 ...
## $ tg_value_inNumeric : num NA NA 1530 66 1990 NA NA NA NA NA ...
## $ tg_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: NA NA 5 5 5 NA NA 5 NA NA ...
## $ label_tg : Factor w/ 6 levels "Thyroglobulin",..: NA NA 1 1 1 NA NA 1 NA NA ...
## $ tpa_value_inNumeric: num 289 14 NA NA NA 39 30 NA 103 124 ...
## $ tpa_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: 1 1 NA NA NA 1 1 NA 1 1 ...
## $ label_tpa : Factor w/ 6 levels "Thyroglobulin",..: 2 2 NA NA NA 2 2 NA 2 2 ...
## $ age : num 26 70 75 1 64 28 88 52 52 59 ...
## $ gender : int 1 2 1 1 2 2 2 2 2 1 ...
## $ t3_medication : chr "No" "No" "No" "No" ...
## $ t4_medication : chr "No" "No" "Yes" "Yes" ...
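In the `str()` output above, `gender` is an integer (1 or 2) because `ifelse()` drops factor attributes and returns the underlying codes. The sketch below shows the same full outer join and fallback logic on hypothetical two-row inputs, converting to character first so the "F"/"M" labels survive; `coalesce_chr` is a helper name introduced here for illustration only.

```r
# Hypothetical inputs: admission 2 has both tests, 1 and 3 have only one
tg_demo <- data.frame(hadm_id = c(1L, 2L),
                      gender = factor(c("F", "M"), levels = c("F", "M")),
                      tg_value = c(8, 4))
tpa_demo <- data.frame(hadm_id = c(2L, 3L),
                       gender = factor(c("M", "F"), levels = c("F", "M")),
                       tpa_value = c(14, 289))

# all = TRUE keeps admissions present in either input (full outer join)
merged_demo <- merge(tg_demo, tpa_demo, by = "hadm_id", all = TRUE,
                     suffixes = c("_tg", "_tpa"))

# Illustrative helper: take y, fall back to x when y is missing,
# which matches the ifelse(!is.na(x) & is.na(y), x, y) pattern above
coalesce_chr <- function(x, y) {
  x <- as.character(x)
  y <- as.character(y)
  ifelse(is.na(y), x, y)
}

merged_demo$gender <- coalesce_chr(merged_demo$gender_tg, merged_demo$gender_tpa)
merged_demo <- merged_demo[, c("hadm_id", "tg_value", "tpa_value", "gender")]
print(merged_demo)
```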
2. Merging the result of step 1 with the Thyroxine (T4) dataframe:
merged_df2 <- merge(merged_df, cleaned_thyroid_t4, by.x = "hadm_id", by.y = "hadm_id", all.x = TRUE, all.y = TRUE)
merged_df2 <- rename(merged_df2, c("value"="t4_value_inNumeric", "valuenum"="t4_value", "valueuom"="t4_value_unit", "label"="label_t4"))
merged_df2$age <- ifelse(!is.na(merged_df2$age.x) & is.na(merged_df2$age.y), merged_df2$age.x, merged_df2$age.y)
merged_df2$gender <- ifelse(!is.na(merged_df2$gender.x) & is.na(merged_df2$gender.y), merged_df2$gender.x, merged_df2$gender.y)
merged_df2$t3_medication <- ifelse(!is.na(merged_df2$t3_medication.x) & is.na(merged_df2$t3_medication.y), merged_df2$t3_medication.x, merged_df2$t3_medication.y)
merged_df2$t4_medication <- ifelse(!is.na(merged_df2$t4_medication.x) & is.na(merged_df2$t4_medication.y), merged_df2$t4_medication.x, merged_df2$t4_medication.y)
merged_df2 <- subset(merged_df2, select = -c(gender.x, gender.y, age.x, age.y, t4_value, t3_medication.x, t3_medication.y, t4_medication.x, t4_medication.y))
head(merged_df2,5)
## hadm_id tg_value_inNumeric tg_value_unit label_tg tpa_value_inNumeric
## 1 100045 NA <NA> <NA> NA
## 2 100096 NA <NA> <NA> NA
## 3 100156 NA <NA> <NA> NA
## 4 100210 NA <NA> <NA> NA
## 5 100253 NA <NA> <NA> 289
## tpa_value_unit label_tpa t4_value_inNumeric t4_value_unit
## 1 <NA> <NA> 19.8 ug/dL
## 2 <NA> <NA> 4.5 ug/dL
## 3 <NA> <NA> 5.9 uG/DL
## 4 <NA> <NA> 5.1 ug/dL
## 5 IU/mL Thyroid Peroxidase Antibodies 16.2 ug/dL
## label_t4 age gender t3_medication t4_medication
## 1 Thyroxine (T4) 69 1 No No
## 2 Thyroxine (T4) 1 1 No Yes
## 3 Thyroxine (T4) 69 2 No No
## 4 Thyroxine (T4) 52 2 No No
## 5 Thyroxine (T4) 26 1 No No
str(merged_df2)
## 'data.frame': 1792 obs. of 14 variables:
## $ hadm_id : int 100045 100096 100156 100210 100253 100262 100300 100302 100319 100343 ...
## $ tg_value_inNumeric : num NA NA NA NA NA NA NA NA NA NA ...
## $ tg_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: NA NA NA NA NA NA NA NA NA NA ...
## $ label_tg : Factor w/ 6 levels "Thyroglobulin",..: NA NA NA NA NA NA NA NA NA NA ...
## $ tpa_value_inNumeric: num NA NA NA NA 289 NA NA NA 14 NA ...
## $ tpa_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: NA NA NA NA 1 NA NA NA 1 NA ...
## $ label_tpa : Factor w/ 6 levels "Thyroglobulin",..: NA NA NA NA 2 NA NA NA 2 NA ...
## $ t4_value_inNumeric : num 19.8 4.5 5.9 5.1 16.2 6.9 5.4 7.4 NA 8.9 ...
## $ t4_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: 7 7 8 7 7 7 7 8 NA 7 ...
## $ label_t4 : Factor w/ 6 levels "Thyroglobulin",..: 4 4 4 4 4 4 4 4 NA 4 ...
## $ age : num 69 1 69 52 26 21 49 48 70 74 ...
## $ gender : int 1 1 2 2 1 2 2 1 2 1 ...
## $ t3_medication : chr "No" "No" "No" "No" ...
## $ t4_medication : chr "No" "Yes" "No" "No" ...
3. Merging the result of step 2 with the Triiodothyronine (T3) dataframe:
merged_df3 <- merge(merged_df2, cleaned_thyroid_t3, by.x = "hadm_id", by.y = "hadm_id", all.x = TRUE, all.y = TRUE)
merged_df3 <- rename(merged_df3, c("value"="t3_value_inNumeric", "valuenum"="t3_value", "valueuom"="t3_value_unit", "label"="label_t3"))
merged_df3$age <- ifelse(!is.na(merged_df3$age.x) & is.na(merged_df3$age.y), merged_df3$age.x, merged_df3$age.y)
merged_df3$gender <- ifelse(!is.na(merged_df3$gender.x) & is.na(merged_df3$gender.y), merged_df3$gender.x, merged_df3$gender.y)
merged_df3$t3_medication <- ifelse(!is.na(merged_df3$t3_medication.x) & is.na(merged_df3$t3_medication.y), merged_df3$t3_medication.x, merged_df3$t3_medication.y)
merged_df3$t4_medication <- ifelse(!is.na(merged_df3$t4_medication.x) & is.na(merged_df3$t4_medication.y), merged_df3$t4_medication.x, merged_df3$t4_medication.y)
merged_df3 <- subset(merged_df3, select = -c(gender.x, gender.y, age.x, age.y, t3_value, t3_medication.x, t3_medication.y, t4_medication.x, t4_medication.y))
head(merged_df3,5)
## hadm_id tg_value_inNumeric tg_value_unit label_tg tpa_value_inNumeric
## 1 100045 NA <NA> <NA> NA
## 2 100096 NA <NA> <NA> NA
## 3 100130 NA <NA> <NA> NA
## 4 100156 NA <NA> <NA> NA
## 5 100210 NA <NA> <NA> NA
## tpa_value_unit label_tpa t4_value_inNumeric t4_value_unit label_t4
## 1 <NA> <NA> 19.8 ug/dL Thyroxine (T4)
## 2 <NA> <NA> 4.5 ug/dL Thyroxine (T4)
## 3 <NA> <NA> NA <NA> <NA>
## 4 <NA> <NA> 5.9 uG/DL Thyroxine (T4)
## 5 <NA> <NA> 5.1 ug/dL Thyroxine (T4)
## t3_value_inNumeric t3_value_unit label_t3 age gender
## 1 95 ng/dL Triiodothyronine (T3) 69 1
## 2 143 ng/dL Triiodothyronine (T3) 1 1
## 3 98 NG/DL Triiodothyronine (T3) 56 1
## 4 NA <NA> <NA> 69 2
## 5 85 ng/dL Triiodothyronine (T3) 52 2
## t3_medication t4_medication
## 1 No No
## 2 No Yes
## 3 No No
## 4 No No
## 5 No No
str(merged_df3)
## 'data.frame': 2109 obs. of 17 variables:
## $ hadm_id : int 100045 100096 100130 100156 100210 100253 100262 100300 100302 100319 ...
## $ tg_value_inNumeric : num NA NA NA NA NA NA NA NA NA NA ...
## $ tg_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: NA NA NA NA NA NA NA NA NA NA ...
## $ label_tg : Factor w/ 6 levels "Thyroglobulin",..: NA NA NA NA NA NA NA NA NA NA ...
## $ tpa_value_inNumeric: num NA NA NA NA NA 289 NA NA NA 14 ...
## $ tpa_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: NA NA NA NA NA 1 NA NA NA 1 ...
## $ label_tpa : Factor w/ 6 levels "Thyroglobulin",..: NA NA NA NA NA 2 NA NA NA 2 ...
## $ t4_value_inNumeric : num 19.8 4.5 NA 5.9 5.1 16.2 6.9 5.4 7.4 NA ...
## $ t4_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: 7 7 NA 8 7 7 7 7 8 NA ...
## $ label_t4 : Factor w/ 6 levels "Thyroglobulin",..: 4 4 NA 4 4 4 4 4 4 NA ...
## $ t3_value_inNumeric : num 95 143 98 NA 85 158 NA NA 55 NA ...
## $ t3_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: 3 3 4 NA 3 3 NA NA 4 NA ...
## $ label_t3 : Factor w/ 6 levels "Thyroglobulin",..: 6 6 6 NA 6 6 NA NA 6 NA ...
## $ age : num 69 1 56 69 52 26 21 49 48 70 ...
## $ gender : int 1 1 1 2 2 1 2 2 1 2 ...
## $ t3_medication : chr "No" "No" "No" "No" ...
## $ t4_medication : chr "No" "Yes" "No" "No" ...
4. Merging the result of step 3 with the Thyroxine (T4), Free dataframe:
merged_df4 <- merge(merged_df3, cleaned_thyroid_t4_free, by.x = "hadm_id", by.y = "hadm_id", all.x = TRUE, all.y = TRUE)
merged_df4 <- rename(merged_df4, c("value"="t4_free_value_inNumeric", "valuenum"="t4_free_value", "valueuom"="t4_free_value_unit", "label"="label_t4_free"))
merged_df4$age <- ifelse(!is.na(merged_df4$age.x) & is.na(merged_df4$age.y), merged_df4$age.x, merged_df4$age.y)
merged_df4$gender <- ifelse(!is.na(merged_df4$gender.x) & is.na(merged_df4$gender.y), merged_df4$gender.x, merged_df4$gender.y)
merged_df4$t3_medication <- ifelse(!is.na(merged_df4$t3_medication.x) & is.na(merged_df4$t3_medication.y), merged_df4$t3_medication.x, merged_df4$t3_medication.y)
merged_df4$t4_medication <- ifelse(!is.na(merged_df4$t4_medication.x) & is.na(merged_df4$t4_medication.y), merged_df4$t4_medication.x, merged_df4$t4_medication.y)
merged_df4 <- subset(merged_df4, select = -c(gender.x, gender.y, age.x, age.y, t4_free_value, t3_medication.x, t3_medication.y, t4_medication.x, t4_medication.y))
head(merged_df4,5)
## hadm_id tg_value_inNumeric tg_value_unit label_tg tpa_value_inNumeric
## 1 100017 NA <NA> <NA> NA
## 2 100045 NA <NA> <NA> NA
## 3 100096 NA <NA> <NA> NA
## 4 100130 NA <NA> <NA> NA
## 5 100153 NA <NA> <NA> NA
## tpa_value_unit label_tpa t4_value_inNumeric t4_value_unit label_t4
## 1 <NA> <NA> NA <NA> <NA>
## 2 <NA> <NA> 19.8 ug/dL Thyroxine (T4)
## 3 <NA> <NA> 4.5 ug/dL Thyroxine (T4)
## 4 <NA> <NA> NA <NA> <NA>
## 5 <NA> <NA> NA <NA> <NA>
## t3_value_inNumeric t3_value_unit label_t3
## 1 NA <NA> <NA>
## 2 95 ng/dL Triiodothyronine (T3)
## 3 143 ng/dL Triiodothyronine (T3)
## 4 98 NG/DL Triiodothyronine (T3)
## 5 NA <NA> <NA>
## t4_free_value_inNumeric t4_free_value_unit label_t4_free age gender
## 1 1.30 ng/dl Thyroxine (T4), Free 27 2
## 2 1.50 ng/dL Thyroxine (T4), Free 69 1
## 3 0.70 ng/dL Thyroxine (T4), Free 1 1
## 4 0.80 ng/dl Thyroxine (T4), Free 56 1
## 5 0.91 ng/dL Thyroxine (T4), Free 79 2
## t3_medication t4_medication
## 1 No No
## 2 No No
## 3 No Yes
## 4 No No
## 5 No Yes
str(merged_df4)
## 'data.frame': 4130 obs. of 20 variables:
## $ hadm_id : int 100017 100045 100096 100130 100153 100156 100187 100210 100225 100247 ...
## $ tg_value_inNumeric : num NA NA NA NA NA NA NA NA NA NA ...
## $ tg_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: NA NA NA NA NA NA NA NA NA NA ...
## $ label_tg : Factor w/ 6 levels "Thyroglobulin",..: NA NA NA NA NA NA NA NA NA NA ...
## $ tpa_value_inNumeric : num NA NA NA NA NA NA NA NA NA NA ...
## $ tpa_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: NA NA NA NA NA NA NA NA NA NA ...
## $ label_tpa : Factor w/ 6 levels "Thyroglobulin",..: NA NA NA NA NA NA NA NA NA NA ...
## $ t4_value_inNumeric : num NA 19.8 4.5 NA NA 5.9 NA 5.1 NA NA ...
## $ t4_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: NA 7 7 NA NA 8 NA 7 NA NA ...
## $ label_t4 : Factor w/ 6 levels "Thyroglobulin",..: NA 4 4 NA NA 4 NA 4 NA NA ...
## $ t3_value_inNumeric : num NA 95 143 98 NA NA NA 85 NA NA ...
## $ t3_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: NA 3 3 4 NA NA NA 3 NA NA ...
## $ label_t3 : Factor w/ 6 levels "Thyroglobulin",..: NA 6 6 6 NA NA NA 6 NA NA ...
## $ t4_free_value_inNumeric: num 1.3 1.5 0.7 0.8 0.91 NA 1.2 0.88 1.4 0.5 ...
## $ t4_free_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: 2 3 3 2 3 NA 3 3 3 2 ...
## $ label_t4_free : Factor w/ 6 levels "Thyroglobulin",..: 5 5 5 5 5 NA 5 5 5 5 ...
## $ age : num 27 69 1 56 79 69 64 52 89 58 ...
## $ gender : int 2 1 1 1 2 2 1 2 1 1 ...
## $ t3_medication : chr "No" "No" "No" "No" ...
## $ t4_medication : chr "No" "No" "Yes" "No" ...
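The repeated merge steps follow one pattern: a full outer join on `hadm_id`, applied to one more dataframe each time. Assuming each input already has uniquely named value columns, that chain can be sketched as a fold with `Reduce()`. The three inputs below are hypothetical and carry only an id and one value column each, so the age, gender, and medication handling shown above is left out.

```r
# Hypothetical per-test inputs with non-overlapping value column names
tg_in <- data.frame(hadm_id = c(1L, 2L), tg_value = c(8, 4))
t4_in <- data.frame(hadm_id = c(2L, 3L), t4_value = c(6.6, 4.4))
t3_in <- data.frame(hadm_id = c(3L, 4L), t3_value = c(49, 74))

# Fold the list left to right, full-outer-joining on hadm_id at each step
wide_demo <- Reduce(
  function(a, b) merge(a, b, by = "hadm_id", all = TRUE),
  list(tg_in, t4_in, t3_in)
)
print(wide_demo)
```

Renaming the value columns before merging, rather than after, is what lets the same join function be reused for every step.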
5. Merging the result of step 4 with the Thyroid Stimulating Hormone dataframe:
merged_df5 <- merge(cleaned_thyroid_tsh, merged_df4, by.x = "hadm_id", by.y = "hadm_id", all.x = TRUE, all.y = TRUE)
merged_df5 <- rename(merged_df5, c("value"="tsh_value_inNumeric", "valuenum"="tsh_value", "valueuom"="tsh_value_unit", "label"="label_tsh"))
merged_df5$age <- ifelse(!is.na(merged_df5$age.x) & is.na(merged_df5$age.y), merged_df5$age.x, merged_df5$age.y)
merged_df5$gender <- ifelse(!is.na(merged_df5$gender.x) & is.na(merged_df5$gender.y), merged_df5$gender.x, merged_df5$gender.y)
merged_df5$t3_medication <- ifelse(!is.na(merged_df5$t3_medication.x) & is.na(merged_df5$t3_medication.y), merged_df5$t3_medication.x, merged_df5$t3_medication.y)
merged_df5$t4_medication <- ifelse(!is.na(merged_df5$t4_medication.x) & is.na(merged_df5$t4_medication.y), merged_df5$t4_medication.x, merged_df5$t4_medication.y)
merged_df5 <- subset(merged_df5, select = -c(gender.x, gender.y, age.x, age.y, tsh_value, t3_medication.x, t3_medication.y, t4_medication.x, t4_medication.y))
head(merged_df5,5)
## hadm_id tsh_value_inNumeric tsh_value_unit label_tsh
## 1 100001 3.80 uIU/mL Thyroid Stimulating Hormone
## 2 100006 NA uU/ML Thyroid Stimulating Hormone
## 3 100017 0.59 uU/ML Thyroid Stimulating Hormone
## 4 100018 0.87 uIU/mL Thyroid Stimulating Hormone
## 5 100021 3.00 uIU/mL Thyroid Stimulating Hormone
## tg_value_inNumeric tg_value_unit label_tg tpa_value_inNumeric tpa_value_unit
## 1 NA <NA> <NA> NA <NA>
## 2 NA <NA> <NA> NA <NA>
## 3 NA <NA> <NA> NA <NA>
## 4 NA <NA> <NA> NA <NA>
## 5 NA <NA> <NA> NA <NA>
## label_tpa t4_value_inNumeric t4_value_unit label_t4 t3_value_inNumeric
## 1 <NA> NA <NA> <NA> NA
## 2 <NA> NA <NA> <NA> NA
## 3 <NA> NA <NA> <NA> NA
## 4 <NA> NA <NA> <NA> NA
## 5 <NA> NA <NA> <NA> NA
## t3_value_unit label_t3 t4_free_value_inNumeric t4_free_value_unit
## 1 <NA> <NA> NA <NA>
## 2 <NA> <NA> NA <NA>
## 3 <NA> <NA> 1.3 ng/dl
## 4 <NA> <NA> NA <NA>
## 5 <NA> <NA> NA <NA>
## label_t4_free age gender t3_medication t4_medication
## 1 <NA> 35 1 No No
## 2 <NA> 48 1 No No
## 3 Thyroxine (T4), Free 27 2 No No
## 4 <NA> 55 2 No No
## 5 <NA> 54 2 No No
str(merged_df5)
## 'data.frame': 12161 obs. of 23 variables:
## $ hadm_id : int 100001 100006 100017 100018 100021 100045 100061 100065 100068 100088 ...
## $ tsh_value_inNumeric : num 3.8 NA 0.59 0.87 3 3.6 2.1 2 2.9 3.5 ...
## $ tsh_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: 9 10 10 9 9 9 9 9 9 9 ...
## $ label_tsh : Factor w/ 6 levels "Thyroglobulin",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ tg_value_inNumeric : num NA NA NA NA NA NA NA NA NA NA ...
## $ tg_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: NA NA NA NA NA NA NA NA NA NA ...
## $ label_tg : Factor w/ 6 levels "Thyroglobulin",..: NA NA NA NA NA NA NA NA NA NA ...
## $ tpa_value_inNumeric : num NA NA NA NA NA NA NA NA NA NA ...
## $ tpa_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: NA NA NA NA NA NA NA NA NA NA ...
## $ label_tpa : Factor w/ 6 levels "Thyroglobulin",..: NA NA NA NA NA NA NA NA NA NA ...
## $ t4_value_inNumeric : num NA NA NA NA NA 19.8 NA NA NA NA ...
## $ t4_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: NA NA NA NA NA 7 NA NA NA NA ...
## $ label_t4 : Factor w/ 6 levels "Thyroglobulin",..: NA NA NA NA NA 4 NA NA NA NA ...
## $ t3_value_inNumeric : num NA NA NA NA NA 95 NA NA NA NA ...
## $ t3_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: NA NA NA NA NA 3 NA NA NA NA ...
## $ label_t3 : Factor w/ 6 levels "Thyroglobulin",..: NA NA NA NA NA 6 NA NA NA NA ...
## $ t4_free_value_inNumeric: num NA NA 1.3 NA NA 1.5 NA NA NA NA ...
## $ t4_free_value_unit : Factor w/ 10 levels "IU/mL","ng/dl",..: NA NA 2 NA NA 3 NA NA NA NA ...
## $ label_t4_free : Factor w/ 6 levels "Thyroglobulin",..: NA NA 5 NA NA 5 NA NA NA NA ...
## $ age : num 35 48 27 55 54 69 62 59 74 74 ...
## $ gender : int 1 1 2 2 2 1 1 2 1 1 ...
## $ t3_medication : chr "No" "No" "No" "No" ...
## $ t4_medication : chr "No" "No" "No" "No" ...
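The four ifelse() calls in step 5 follow one pattern: after a full outer merge, take the `.x` copy of a column when only that copy is present, otherwise take the `.y` copy. The sketch below shows the pattern on toy data. The helper `coalesce_xy()` and the data frames `left` and `right` are hypothetical names introduced for illustration and are not part of the original analysis.

```r
# Toy inputs: each admission appears in one or both tables.
left  <- data.frame(hadm_id = c(1, 2), age = c(35, 48))
right <- data.frame(hadm_id = c(2, 3), age = c(48, 27))

# Full outer join; the shared non-key column is split into age.x / age.y.
merged <- merge(left, right, by = "hadm_id", all = TRUE)

# Hypothetical helper applying the same rule as the ifelse() calls above:
# use x only when x is present and y is missing, otherwise use y.
coalesce_xy <- function(x, y) ifelse(!is.na(x) & is.na(y), x, y)

merged$age <- coalesce_xy(merged$age.x, merged$age.y)
merged <- subset(merged, select = -c(age.x, age.y))
print(merged)
```

Wrapping the rule in a function keeps the condition in one place, so it could be applied to age, gender, t3_medication and t4_medication without retyping it.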
Checking for NAs in df_unique (which contains one record per unique hadm_id):
df_unique <- merged_df5
colSums(is.na(df_unique))
## hadm_id tsh_value_inNumeric tsh_value_unit
## 0 192 98
## label_tsh tg_value_inNumeric tg_value_unit
## 98 12121 12114
## label_tg tpa_value_inNumeric tpa_value_unit
## 12114 12127 12096
## label_tpa t4_value_inNumeric t4_value_unit
## 12096 10422 10418
## label_t4 t3_value_inNumeric t3_value_unit
## 10418 11010 10978
## label_t3 t4_free_value_inNumeric t4_free_value_unit
## 10978 9149 9130
## label_t4_free age gender
## 9130 0 0
## t3_medication t4_medication
## 0 0
Removing unwanted columns:
df_unique <- subset(df_unique, select = -c(hadm_id, label_tsh, label_tg, label_tpa, label_t4,label_t3, label_t4_free))
head(df_unique,10)
## tsh_value_inNumeric tsh_value_unit tg_value_inNumeric tg_value_unit
## 1 3.80 uIU/mL NA <NA>
## 2 NA uU/ML NA <NA>
## 3 0.59 uU/ML NA <NA>
## 4 0.87 uIU/mL NA <NA>
## 5 3.00 uIU/mL NA <NA>
## 6 3.60 uIU/mL NA <NA>
## 7 2.10 uIU/mL NA <NA>
## 8 2.00 uIU/mL NA <NA>
## 9 2.90 uIU/mL NA <NA>
## 10 3.50 uIU/mL NA <NA>
## tpa_value_inNumeric tpa_value_unit t4_value_inNumeric t4_value_unit
## 1 NA <NA> NA <NA>
## 2 NA <NA> NA <NA>
## 3 NA <NA> NA <NA>
## 4 NA <NA> NA <NA>
## 5 NA <NA> NA <NA>
## 6 NA <NA> 19.8 ug/dL
## 7 NA <NA> NA <NA>
## 8 NA <NA> NA <NA>
## 9 NA <NA> NA <NA>
## 10 NA <NA> NA <NA>
## t3_value_inNumeric t3_value_unit t4_free_value_inNumeric t4_free_value_unit
## 1 NA <NA> NA <NA>
## 2 NA <NA> NA <NA>
## 3 NA <NA> 1.3 ng/dl
## 4 NA <NA> NA <NA>
## 5 NA <NA> NA <NA>
## 6 95 ng/dL 1.5 ng/dL
## 7 NA <NA> NA <NA>
## 8 NA <NA> NA <NA>
## 9 NA <NA> NA <NA>
## 10 NA <NA> NA <NA>
## age gender t3_medication t4_medication
## 1 35 1 No No
## 2 48 1 No No
## 3 27 2 No No
## 4 55 2 No No
## 5 54 2 No No
## 6 69 1 No No
## 7 62 1 No No
## 8 59 2 No No
## 9 74 1 No Yes
## 10 74 1 No Yes
Removing the record with a NULL tsh_value_unit (row 10004) and dropping the now-unused factor level:
df_unique <- df_unique[-c(10004), ]
df_unique$tsh_value_unit <- droplevels(df_unique$tsh_value_unit)
head(df_unique$tsh_value_unit,10)
## [1] uIU/mL uU/ML uU/ML uIU/mL uIU/mL uIU/mL uIU/mL uIU/mL uIU/mL uIU/mL
## Levels: uIU/mL uU/ML
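Dropping a row by its position works only while the row order stays the same. A hedged alternative is to select the record by its value instead. The sketch below uses toy data. The placeholder level "NULL" and the vector `valid_units` are assumptions, since the output above does not show what the removed record contained.

```r
# Toy data: one row carries an unexpected unit, one row has a missing unit.
toy <- data.frame(
  tsh_value_inNumeric = c(3.8, 0.59, 2.1, 0.87),
  tsh_value_unit = factor(c("uIU/mL", "uU/ML", "NULL", NA))
)

# Assumed set of accepted TSH units.
valid_units <- c("uIU/mL", "uU/ML")

# Keep rows whose unit is accepted or missing; drop anything else.
keep <- is.na(toy$tsh_value_unit) | toy$tsh_value_unit %in% valid_units
toy <- toy[keep, ]

# Remove the level that no longer occurs in the data.
toy$tsh_value_unit <- droplevels(toy$tsh_value_unit)
print(levels(toy$tsh_value_unit))
```

Rows with a missing unit are kept here on purpose, mirroring the analysis above, where the NA count for tsh_value_unit is unchanged by this step.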
Flagging whether a patient has hypothyroidism based on the TSH value:
for (l in 1:nrow(df_unique)) {
  if (is.na(df_unique$tsh_value_inNumeric[l])) {
    # Missing TSH value: leave the flag undefined
    df_unique$has_hypothyroidism[l] <- NA
  }
  else if (df_unique$tsh_value_inNumeric[l] >= 4.5) {
    # TSH at or above 4.5 is treated as hypothyroidism
    df_unique$has_hypothyroidism[l] <- "Yes"
  }
  else {
    df_unique$has_hypothyroidism[l] <- "No"
  }
}
head(df_unique,10) # Check tsh value
## tsh_value_inNumeric tsh_value_unit tg_value_inNumeric tg_value_unit
## 1 3.80 uIU/mL NA <NA>
## 2 NA uU/ML NA <NA>
## 3 0.59 uU/ML NA <NA>
## 4 0.87 uIU/mL NA <NA>
## 5 3.00 uIU/mL NA <NA>
## 6 3.60 uIU/mL NA <NA>
## 7 2.10 uIU/mL NA <NA>
## 8 2.00 uIU/mL NA <NA>
## 9 2.90 uIU/mL NA <NA>
## 10 3.50 uIU/mL NA <NA>
## tpa_value_inNumeric tpa_value_unit t4_value_inNumeric t4_value_unit
## 1 NA <NA> NA <NA>
## 2 NA <NA> NA <NA>
## 3 NA <NA> NA <NA>
## 4 NA <NA> NA <NA>
## 5 NA <NA> NA <NA>
## 6 NA <NA> 19.8 ug/dL
## 7 NA <NA> NA <NA>
## 8 NA <NA> NA <NA>
## 9 NA <NA> NA <NA>
## 10 NA <NA> NA <NA>
## t3_value_inNumeric t3_value_unit t4_free_value_inNumeric t4_free_value_unit
## 1 NA <NA> NA <NA>
## 2 NA <NA> NA <NA>
## 3 NA <NA> 1.3 ng/dl
## 4 NA <NA> NA <NA>
## 5 NA <NA> NA <NA>
## 6 95 ng/dL 1.5 ng/dL
## 7 NA <NA> NA <NA>
## 8 NA <NA> NA <NA>
## 9 NA <NA> NA <NA>
## 10 NA <NA> NA <NA>
## age gender t3_medication t4_medication has_hypothyroidism
## 1 35 1 No No No
## 2 48 1 No No <NA>
## 3 27 2 No No No
## 4 55 2 No No No
## 5 54 2 No No No
## 6 69 1 No No No
## 7 62 1 No No No
## 8 59 2 No No No
## 9 74 1 No Yes No
## 10 74 1 No Yes No
‘Yes’ (hypothyroidism) is assigned when the TSH value is 4.5 or above and ‘No’ when it is below 4.5; records with a missing TSH value are left as NA.
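The same rule can be written without a loop. The sketch below applies it to a toy vector `tsh` (a made-up name and made-up values) and relies on ifelse() returning NA wherever its test is NA.

```r
# Toy TSH values, including one missing value and one exactly at the cutoff.
tsh <- c(3.8, NA, 0.59, 4.5, 7.2)

# Vectorized version of the rule: "Yes" at or above 4.5, "No" below,
# and NA wherever the TSH value itself is NA.
has_hypothyroidism <- ifelse(tsh >= 4.5, "Yes", "No")
print(has_hypothyroidism)
```

On a data frame this would amount to a single assignment from the TSH column, which avoids growing the result one element at a time.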
Checking how many NAs are present:
colSums(is.na(df_unique))
## tsh_value_inNumeric tsh_value_unit tg_value_inNumeric
## 192 98 12120
## tg_value_unit tpa_value_inNumeric tpa_value_unit
## 12113 12126 12095
## t4_value_inNumeric t4_value_unit t3_value_inNumeric
## 10421 10417 11009
## t3_value_unit t4_free_value_inNumeric t4_free_value_unit
## 10977 9149 9130
## age gender t3_medication
## 0 0 0
## t4_medication has_hypothyroidism
## 0 192
Removing NAs and unwanted variables:
df_unique <- df_unique[!is.na(df_unique$tsh_value_inNumeric),]
df_unique <- subset(df_unique, select = -c(tsh_value_unit, tg_value_inNumeric, tg_value_unit, tpa_value_inNumeric, tpa_value_unit))
df_unique <- df_unique[!is.na(df_unique$t4_free_value_inNumeric), ]
df_unique <- df_unique[!is.na(df_unique$t3_value_inNumeric),]
colSums(is.na(df_unique))
## tsh_value_inNumeric t4_value_inNumeric t4_value_unit
## 0 286 286
## t3_value_inNumeric t3_value_unit t4_free_value_inNumeric
## 0 0 0
## t4_free_value_unit age gender
## 0 0 0
## t3_medication t4_medication has_hypothyroidism
## 0 0 0
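The three row filters above keep only records where TSH, free T4 and T3 are all present, while NAs in the total T4 column are tolerated. A compact way to express the same idea is complete.cases() on the required columns. The sketch below uses a toy data frame with shortened, hypothetical column names.

```r
# Toy data with NAs scattered over required and optional columns.
toy <- data.frame(
  tsh     = c(3.8, NA, 0.59, 2.0),
  t4_free = c(1.5, 1.3, NA, 0.8),
  t3      = c(95, 80, 60, 70),
  t4      = c(NA, 5.1, 7.4, 8.9)
)

# Columns that must be present; t4 is deliberately left out.
required <- c("tsh", "t4_free", "t3")

# Keep rows with no NA in any required column.
toy_complete <- toy[complete.cases(toy[, required]), ]
print(colSums(is.na(toy_complete)))
```

Only the first and last rows survive, and the remaining NA in `t4` shows that optional columns are not filtered.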
Creating final dataframe and dropping unused levels:
final_df <- df_unique
final_df$t4_value_unit <- droplevels(final_df$t4_value_unit)
final_df$t3_value_unit <- droplevels(final_df$t3_value_unit)
final_df$t4_free_value_unit <- droplevels(final_df$t4_free_value_unit)
final_df <- rename(final_df, c("t4_value_inNumeric"="t4_value", "t3_value_inNumeric"="t3_value", "t4_free_value_inNumeric"="t4_free_value"))
final_df <- subset(final_df, select = -c(tsh_value_inNumeric ,t3_value_unit, t4_free_value_unit, t4_value_unit))
final_df$gender <- as.factor(final_df$gender)
final_df$gender <- revalue(final_df$gender, c("1"="Female", "2"="Male"))
head(final_df,10)
## t4_value t3_value t4_free_value age gender t3_medication t4_medication
## 6 19.8 95 1.50 69 Female No No
## 12 4.5 143 0.70 1 Female No Yes
## 16 NA 98 0.80 56 Female No No
## 27 5.1 85 0.88 52 Male No No
## 40 7.4 55 1.30 48 Female No No
## 46 8.9 56 2.10 74 Female No No
## 55 NA 32 0.80 57 Female No No
## 91 NA 83 1.70 84 Female No No
## 93 NA 97 1.50 29 Female No No
## 101 18.9 363 5.60 20 Female No No
## has_hypothyroidism
## 6 No
## 12 Yes
## 16 No
## 27 No
## 40 No
## 46 No
## 55 No
## 91 Yes
## 93 No
## 101 No
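The gender recoding above uses revalue(), which is not part of base R. A base R sketch of the same mapping is shown below on a toy integer vector `gender_code` (a hypothetical name); it assumes, as in the analysis, that code 1 means Female and code 2 means Male.

```r
# Toy integer codes as they appear in the merged data.
gender_code <- c(2L, 1L, 1L, 2L)

# Map codes to labels in one step: level 1 -> "Female", level 2 -> "Male".
gender <- factor(gender_code, levels = c(1, 2), labels = c("Female", "Male"))
print(table(gender))
```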
Summary of final dataframe:
final_df$t3_medication <- factor(final_df$t3_medication)
final_df$t4_medication <- factor(final_df$t4_medication)
final_df$has_hypothyroidism <- factor(final_df$has_hypothyroidism)
summary(final_df)
## t4_value t3_value t4_free_value age gender
## Min. : 1.300 Min. : 23.00 Min. :0.170 Min. : 1.00 Female:362
## 1st Qu.: 4.350 1st Qu.: 49.00 1st Qu.:0.850 1st Qu.:52.00 Male :287
## Median : 5.700 Median : 63.00 Median :1.100 Median :66.00
## Mean : 5.942 Mean : 68.74 Mean :1.124 Mean :63.12
## 3rd Qu.: 7.100 3rd Qu.: 83.00 3rd Qu.:1.300 3rd Qu.:77.00
## Max. :22.200 Max. :363.00 Max. :6.200 Max. :89.00
## NA's :286
## t3_medication t4_medication has_hypothyroidism
## No :642 No :358 No :364
## Yes: 7 Yes:291 Yes:285
##
##
##
##
##
Correlation between variables (Spearman rank correlation, with factor variables converted to their integer level codes):
final_df_numeric <- final_df
final_df_numeric$has_hypothyroidism <- as.numeric(final_df_numeric$has_hypothyroidism)
final_df_numeric$gender <- as.numeric(final_df_numeric$gender)
final_df_numeric$t3_medication <- as.numeric(final_df_numeric$t3_medication)
final_df_numeric$t4_medication <- as.numeric(final_df_numeric$t4_medication)
c <- cor(final_df_numeric, use= "pairwise.complete.obs", method = "spearman")
c
## t4_value t3_value t4_free_value age
## t4_value 1.00000000 0.58161151 0.67166260 -0.015815016
## t3_value 0.58161151 1.00000000 0.36301347 -0.175462530
## t4_free_value 0.67166260 0.36301347 1.00000000 0.139524212
## age -0.01581502 -0.17546253 0.13952421 1.000000000
## gender -0.07315593 -0.00259187 -0.01631378 -0.023381097
## t3_medication -0.02051961 -0.03957619 -0.05029692 -0.002230081
## t4_medication -0.26299894 -0.23503402 -0.17557544 0.035772339
## has_hypothyroidism -0.15677556 -0.08155946 -0.22555479 0.089466104
## gender t3_medication t4_medication has_hypothyroidism
## t4_value -0.073155931 -0.020519607 -0.26299894 -0.15677556
## t3_value -0.002591870 -0.039576187 -0.23503402 -0.08155946
## t4_free_value -0.016313782 -0.050296915 -0.17557544 -0.22555479
## age -0.023381097 -0.002230081 0.03577234 0.08946610
## gender 1.000000000 -0.002869326 -0.10408907 -0.05646476
## t3_medication -0.002869326 1.000000000 0.05582933 0.02783459
## t4_medication -0.104089066 0.055829327 1.00000000 0.41334656
## has_hypothyroidism -0.056464756 0.027834589 0.41334656 1.00000000
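The two largest off-diagonal correlations are between the hormone measurements:
t4_free_value and t4_value have a Spearman correlation of about 0.67 (0.67166260).
t3_value and t4_value have a Spearman correlation of about 0.58 (0.58161151).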
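To make the method concrete: Spearman correlation is the Pearson correlation of the ranks, and as.numeric() on a factor returns its integer level codes. The sketch below checks both on toy vectors (`x`, `y` and `f` are made-up names and values, not data from the analysis).

```r
# Toy numeric vectors without ties.
x <- c(19.8, 4.5, 5.1, 7.4, 8.9)
y <- c(1.5, 0.7, 0.88, 1.3, 2.1)

# Spearman correlation computed directly and via Pearson on ranks.
rho       <- cor(x, y, method = "spearman")
rho_ranks <- cor(rank(x), rank(y))
print(c(rho, rho_ranks))

# A two-level factor becomes codes 1 and 2 (levels are sorted alphabetically).
f <- factor(c("No", "Yes", "No"))
print(as.numeric(f))
```

Because a two-level factor only ever takes the codes 1 and 2, its rank correlation with another variable reflects the direction and strength of the association between the two groups, not a distance between categories.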