Try neural network for CSO forecasting #9

Open
amfrandolph opened this issue Jan 21, 2015 · 6 comments

Comments

@amfrandolph
Contributor

This was a suggestion from Seb.

"Zane's R code uses a linear combination of many variables:
yhat = b0 + b1*x1 + b2*x2 + ... + bp*xp

But the learner could very well be of the form
yhat = b0 + b1*x1^c1 + b2*x2^c2 + ... + bp*xp^cp
where c1, c2, ..., cp are exponents > 1.

I understand that in our case we have many inputs (the x_i, precipitation values for each month at each site) and that the output y is the number of overflows (CSOs).

I think that a neural network might be better suited to CSO forecasting, since it can presumably explore a more diverse landscape of behaviors (such as the non-linear ones)."
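
To make the two model forms above concrete, here is a minimal Python sketch that evaluates both. The coefficients, exponents, and inputs are made-up illustration values, not fitted parameters from Zane's R code or from the project data.

```python
def predict_linear(b0, b, x):
    """Linear form: yhat = b0 + b1*x1 + ... + bp*xp."""
    return b0 + sum(bi * xi for bi, xi in zip(b, x))


def predict_power(b0, b, c, x):
    """Power form: yhat = b0 + b1*x1^c1 + ... + bp*xp^cp."""
    return b0 + sum(bi * xi ** ci for bi, xi, ci in zip(b, x, c))


# Hypothetical values, chosen only for illustration.
b0 = 1.0
b = [0.5, 0.25]
x = [2.0, 3.0]

linear_yhat = predict_linear(b0, b, x)
power_yhat = predict_power(b0, b, [2.0, 2.0], x)

# With every exponent equal to 1, the power form reduces to the linear form.
same_as_linear = predict_power(b0, b, [1.0, 1.0], x)
```

The linear form is the special case of the power form with all exponents set to 1. A neural network would not assume either shape in advance.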

@sebhtml
Member

sebhtml commented Jan 21, 2015

I will try Torch:

I believe that it is programmable in Lua.

I will first go through this example:
https://github.com/nicholas-leonard/dp/blob/master/doc/neuralnetworktutorial.md

Also, I will figure out the input format required by Torch and then convert
the data in SewageModel/data/ to that format (the so-called data janitor task).

@sebhtml
Member

sebhtml commented Jan 21, 2015

Indicators of progress:

@sebhtml
Member

sebhtml commented Jan 24, 2015

@zscore Both files (munged_data.RDS and transformed_precip.RDS) have 67912 rows. Row i in the first file is paired with row i in the second file, right?
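
If the pairing is indeed by row index (this is still an open question here), the check could be sketched as follows in Python. The tiny in-memory lists stand in for the contents of the two RDS files; `pair_rows` is a hypothetical helper, not code from the repository.

```python
def pair_rows(inputs, outputs):
    """Pair row i of `inputs` with row i of `outputs`.

    Raises ValueError when the row counts differ, because a
    positional pairing would then be meaningless.
    """
    if len(inputs) != len(outputs):
        raise ValueError(
            "row count mismatch: %d inputs vs %d outputs"
            % (len(inputs), len(outputs))
        )
    return list(zip(inputs, outputs))


# Made-up rows: precipitation features and 0/1 overflow flags.
precip_rows = [[0.0, 0.1], [0.4, 1.2], [2.3, 0.0]]
overflow_rows = [[0, 0], [1, 0], [1, 1]]

examples = pair_rows(precip_rows, overflow_rows)
```

Failing loudly on a length mismatch is a deliberate choice: a silent `zip` would truncate to the shorter file and hide a misalignment.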

@sebhtml
Member

sebhtml commented Jan 24, 2015

@zscore In munged_data.RDS, the columns from "segment_1" through "Wilmette DS-M114N-2" are names of places where a CSO can occur, right? 0 means normal and 1 means overflow. Is that correct?

@amfrandolph Also, in transformed_precip.RDS there are columns with similar names, for example ord_precip_1, ord_precip_2, ord_precip_97, and so on. I suppose that "ord" is for the airport. What is the meaning of the number at the end (1, 2, 97, and so on)?

The input values are precipitation values (67912 examples). The output values are the sewage overflows (segment_* or other, stranger names).
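
Assuming that reading of the columns is right, selecting the outputs and inputs by name could look like the Python sketch below. The header lists are shortened, hypothetical stand-ins: only segment_1, "Wilmette DS-M114N-2", and the ord_precip_* names come from this thread, and the remaining names are placeholders.

```python
def columns_between(header, first, last):
    """Return the column names from `first` through `last`, inclusive."""
    start = header.index(first)
    end = header.index(last)
    if end < start:
        raise ValueError("%r appears before %r" % (last, first))
    return header[start:end + 1]


def columns_with_prefix(header, prefix):
    """Return the column names that start with `prefix`."""
    return [name for name in header if name.startswith(prefix)]


# Hypothetical, shortened headers ("placeholder_*" names are invented).
munged_header = ["placeholder_a", "segment_1", "segment_2",
                 "Wilmette DS-M114N-2", "placeholder_b"]
precip_header = ["ord_precip_1", "ord_precip_2", "ord_precip_97",
                 "placeholder_c"]

output_columns = columns_between(munged_header, "segment_1",
                                 "Wilmette DS-M114N-2")
input_columns = columns_with_prefix(precip_header, "ord_precip_")
```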

@amfrandolph
Contributor Author

@seb I don't have the answer to the question about the column names. Scott
may, or he can give us a lead on who originally wrote the code. I would be
glad to help write a code book that defines our variables, as we keep
learning.

- Alan


@sebhtml
Member

sebhtml commented Jan 26, 2015

Zane said that he was able to make 'glm' converge by using fewer predictors (he said that there were issues when too many of the predictors are correlated).
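
One common way to reduce correlated predictors is a greedy correlation filter, sketched below in Python. This is not necessarily what Zane did; the 0.95 threshold, the column names, and the values are assumptions made for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / math.sqrt(var_a * var_b)


def drop_correlated(columns, threshold=0.95):
    """Keep a column only if its absolute correlation with every
    previously kept column is below `threshold` (greedy, order-dependent)."""
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) < threshold
               for other in kept.values()):
            kept[name] = values
    return kept


# Made-up predictors: p2 is an exact multiple of p1, p3 is not.
predictors = {
    "p1": [0.0, 1.0, 2.0, 3.0],
    "p2": [0.0, 2.0, 4.0, 6.0],
    "p3": [1.0, 0.0, 1.0, 0.0],
}
kept_predictors = drop_correlated(predictors)
```

The filter is order-dependent: whichever column of a correlated pair comes first is the one that survives.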

In the paper "Hydrologic and Hydraulic Modeling of the Tunnel and Reservoir Plan.pdf", they focused on dropshaft CDS-51.
