Rationale - The Second Deedle #13
Comments
Good question. I know there will be some duplication of work, but there are some differences between the two projects, and users can choose according to their own situation. I have just come up with some reasons.
@dotChris90 Do you have more? @pkese I hope we can find some common ground and cooperate; we are all advocates of .NET. |
@Oceania2018 Good. I'm all in if you wish to improve upon Deedle. I have, however, found that the number of annoyances in Deedle is approximately the same as in Pandas. With Pandas you have a somewhat larger surface area, whereas with Deedle you have to deal with types a bit more. Deedle can have a smaller surface area, because you can simply do a

Regarding performance issues, Deedle is on average good enough, and I'm not sure you can beat it in any substantial way. The main thing is that Deedle is way faster than Pandas (in my experience, what took 4 hours in Pandas took just 5 minutes in Deedle). Adding a few percent on top of that is negligible.

My main worry, however, is that there are 5 or 6 big SciSharp projects (Pandas.Net, Tensorflow, NumSharp, SciSharpLearn, etc.) with something like 3 active developers behind them. That is a rather large surface area for such a small team to cover, and there should be a solid reason for people to switch to or start participating in your project rather than in more established projects with larger existing community participation and solid documentation. And by 'reasons', I mean reasons besides not-invented-here or not-Pythonic-enough.

If you can't provide a good answer to such questions, you are unlikely to gain much community support, and without a community you will eventually give up, wasting your (and even other people's) time. On the other hand, if you provide excellent answers to those questions, people might prefer to contribute their time and code to your project rather than to Deedle. |
NumSharp adopts several providers; the default is implemented in pure C# (worst performance). Deedle uses |
The only thing I know right now is that NumSharp uses its own NDArrays, not .NET arrays, while Deedle uses .NET arrays (correct me if I am wrong). We could also ask the FSLab community in general whether they are interested in a SciPy-like stack. And I think that when I tried out Deedle, it did not work well in PowerShell (so, another .NET language). Before somebody complains: this could be related to PowerShell's import mechanism, I'm not sure. |
@Oceania2018 lol, I was about to say the same as you. |
Wonderful. That's exactly the stuff that you need to expose a bit more and put in front. |
@pkese Deedle uses |
@Oceania2018 Yes, yes, but I have to admit @pkese is right: we need to extend the README. Otherwise people will think "yes, this is a 2nd Deedle" or "why are these guys making a 2nd Deedle?" |
We are pursuing a Python-like experience: doing machine learning in .NET should feel just as smooth as in Python. @pkese The other point. @dotChris90 Yes, we will explain more in the README.
I don't have enough time to join a detailed discussion, but saying Deedle uses objects everywhere is not right. When you have a column of floating point values, the data is actually stored as |
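The comment above is cut off in this copy, so what Deedle actually stores is not shown here. Purely as a language-neutral illustration of why a typed column differs from "objects everywhere", here is a minimal Python sketch using only the standard library. The function names are hypothetical and none of this is Deedle or NumSharp code; it only compares one contiguous buffer of doubles against a list of separately allocated float objects.

```python
import sys
from array import array


def boxed_column_bytes(values):
    """Approximate bytes used by a list of separately allocated float objects."""
    # The list stores one reference per element; every float is its own object.
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)


def typed_column_bytes(values):
    """Payload bytes when the same values are packed into one buffer of doubles."""
    packed = array("d", values)  # "d" = C double, stored contiguously
    return packed.itemsize * len(packed)


# Hypothetical column of 100,000 floating point values (an assumption for the demo).
column = [float(i) for i in range(100_000)]
boxed = boxed_column_bytes(column)
typed = typed_column_bytes(column)
print(f"boxed objects: {boxed} bytes, typed buffer: {typed} bytes")
```

The exact numbers depend on the runtime, and .NET boxing costs differ from CPython's, so treat the ratio as indicative only.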
Deedle is OK, but it suffers from poor performance on larger datasets. I found the Extreme Optimization library to be far more performant (by at least an order of magnitude) when I last compared them. However, one is free and the other is licensed. Deedle is a small-dataset-only solution and is in no way comparable to Pandas or to where Pandas is headed. NB: when I talk about large datasets I only mean millions of rows, so not even big data. Deedle is palatable for perhaps thousands or tens of thousands of rows. |
Deedle's interfaces are so different from Pandas that a .NET port of Pandas is absolutely necessary to make use of what Python has achieved. After all, IronPython cannot be used as a version of Python. |
Recently, I found that the next version of the pythonnet project may solve many problems with interoperability between C# and Python. |
@lidanger have you set it up in pythonnet? |
I have used pandas and other Python packages through pythonnet for several months. Version 2.3 is not so good for multi-platform use, but 2.4 has made great progress. Although it has not been released yet, I have been using it successfully these days in my project, which targets .NET Core 2.1 and .NET Framework 4.6.1. |
Please provide a rationale somewhere in the main README: why should people use, develop, or participate in this project rather than Deedle?