Talk:Pliciloricus
This article is rated Stub-class on Wikipedia's content assessment scale.
Q4
# Q4 - Assessing the impact (15 points)
# To assess the impact of air quality, we will explore the aggregated data
# from the Behavioral Risk Factor Surveillance System (BRFSS) in brfss_06_20.json.
# The structure of the data is:
# year -> metric name -> {state code: average days out of 30 across all individuals}
# Please wrangle this data into a data frame, called brfss_df, where each row is a state and year combination,
# and the columns are year, state_code, and every possible health metric that is in the data.
# Here are the definitions for the metrics:
# "energetic_days": "How many days full of energy in past 30 days"
# "bad_mental_health_days": "Now thinking about your mental health,
# which includes stress, depression, and problems with emotions, for how many days during the past 30 days was your mental health not good?"
# "bad_physical_health_days": "Now thinking about your physical health, which includes physical illness and injury,
# for how many days during the past 30 days was your physical health not good?"
# Please report:
# - the number of rows and columns in brfss_df
# - the average for each health metric across all states and all years in brfss_df
# Make sure that the averages are sensible: different years may have different metrics or different ways of handling the data.
# Using negative values when the metric must be non-negative is an old way of encoding missing values.
import json
import numpy as np
import pandas as pd

# Load the nested JSON: year -> metric name -> {state code: value}
with open('brfss_06_20.json', 'r') as f:
    brfss_data = json.load(f)

# Build one record per (year, state_code) and add each metric to it as a column
rows = {}
for year, metrics in brfss_data.items():
    for metric, state_data in metrics.items():
        for state_code, value in state_data.items():
            row = rows.setdefault((year, state_code),
                                  {'year': int(year), 'state_code': state_code})
            # Negative values encode missing data, so store them as NaN
            row[metric] = float(value) if value >= 0 else np.nan

# Metrics absent for a given year/state become NaN automatically
brfss_df = pd.DataFrame(list(rows.values()))

print(f"#rows in brfss_df: {brfss_df.shape[0]}, columns: {brfss_df.shape[1]}")
# Average every metric column found in the data; mean() skips NaN by default
metric_cols = [c for c in brfss_df.columns if c not in ('year', 'state_code')]
for metric in metric_cols:
    avg = brfss_df[metric].mean()
    print(f"Average {metric}: {avg:.2f}")
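Because brfss_06_20.json is not available here, the reshaping can at least be checked on a small made-up sample with the nesting described in the question. The sketch below is only an illustration: the `sample` dictionary, its state codes and numbers, and the helper name `wrangle` are all invented for the example and are not real BRFSS data. It goes through a long table and `DataFrame.pivot`, masking negative values as missing before averaging.

```python
import pandas as pd

# Made-up sample in the described shape: year -> metric name -> {state code: value}.
# The state codes and numbers are illustrative only, not real BRFSS figures.
sample = {
    "2006": {
        "bad_mental_health_days": {"01": 3.5, "02": -1},
        "bad_physical_health_days": {"01": 4.0, "02": 3.0},
    },
    "2007": {
        "bad_mental_health_days": {"01": 3.0, "02": 4.0},
        "energetic_days": {"01": 18.0, "02": 20.0},
    },
}


def wrangle(data):
    """Return one row per (year, state_code) with one column per metric."""
    # Flatten the nested dictionaries into a long table first.
    records = [
        {"year": int(year), "state_code": state, "metric": metric, "value": value}
        for year, metrics in data.items()
        for metric, by_state in metrics.items()
        for state, value in by_state.items()
    ]
    long_df = pd.DataFrame(records)
    # Negative values encode "missing" for metrics that cannot be negative.
    values = long_df["value"].astype(float)
    long_df["value"] = values.where(values >= 0)
    # Spread the metric names into columns; absent metrics become NaN.
    wide = long_df.pivot(index=["year", "state_code"], columns="metric", values="value")
    wide.columns.name = None
    return wide.reset_index()


sample_df = wrangle(sample)
print(sample_df.shape)
print(sample_df.drop(columns=["year", "state_code"]).mean())
```

On this sample the -1 entry is excluded from the mental-health average rather than pulling it down, which is the behaviour the question's note about negative values asks for.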
This code has not been debugged because brfss_06_20.json is missing; if the file was not sent to me, please debug it yourself. 160.30.98.69 (talk) 01:14, 7 December 2024 (UTC)