-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add raw price threshold for sales val #142
Conversation
@@ -178,6 +178,7 @@ def classify_outliers(df, stat_groups: list, min_threshold): | |||
"sv_ind_ptax_flag_w_high_price_sqft": "High price per square foot", | |||
"sv_ind_price_low_price_sqft": "Low price per square foot", | |||
"sv_ind_ptax_flag_w_low_price_sqft": "Low price per square foot", | |||
"sv_ind_raw_price_threshold": "Raw price threshold", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assigning this reason with priority right after the other price flags that are generated with the statistical flagging groups.
Order of assignment between the three outlier reasons will be
- Price outlier
- Raw price outlier (this cap)
- Ptax
- Other char reasons
@@ -807,6 +807,10 @@ def outlier_type(df: pd.DataFrame, condos: bool) -> pd.DataFrame: | |||
"sv_ind_price_low_price_sqft", | |||
] | |||
|
|||
# Implement raw threshold, unlog price | |||
price_conditions.append((10 ** df["meta_sale_price"]) > 15_000_000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This value is pending a final decision
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Do we also need to save this value to the parameter
table as a new column?
Co-authored-by: Jean Cochrane <jeancochrane@users.noreply.github.com>
Co-authored-by: Jean Cochrane <jeancochrane@users.noreply.github.com>
"min_group_thresh": [min_group_thresh], | ||
"raw_price_thresh": [raw_price_thresh], | ||
"short_term_owner_threshold": [short_term_threshold], | ||
"min_group_thresh": [min_group_threshold], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This min_group_thresh should eventually be migrated to min_group_threshold to maintain naming style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! Just make sure that the threshold is correct before merging.
Co-authored-by: Jean Cochrane <jeancochrane@users.noreply.github.com>
Co-authored-by: Jean Cochrane <jeancochrane@users.noreply.github.com>
In the current methodology, a sale with a price of 10 billion could make it into the model. This is because the statistical groups sales val uses to flag sales automatically include the sales if its group has less than 30 sales. We add a cap that automatically disqualifies sales regardless of the group threshold.
We also standardize instances of "thresh" naming to "threshold"