Skip to content

Conversation

@janezd
Copy link
Contributor

@janezd janezd commented Jul 11, 2022

  • Added tests for computation
  • Changed treatment of nans for discrete values:
    • if value is nan, it doesn't mean it does not contain particular substrings, so newly created values must be nan.
    • This required too many changes wrt to string variables, so I created a separate class for one hot encoding of discrete values,
    • which are now computed faster (each distinct value is split only once and updated for all appearances),
    • so there is also need for shared computation because it only wastes memory.
  • Renamed widget and modules.
Includes
  • Code changes
  • Tests
  • Documentation

@janezd janezd force-pushed the split-polish branch 3 times, most recently from 63544d9 to 4a0756f Compare July 12, 2022 11:53
@ajdapretnar
Copy link
Contributor

Counting nans works for discrete but not for string.

Screenshot 2022-07-18 at 15 50 01

@janezd
Copy link
Contributor Author

janezd commented Jul 20, 2022

Counting nans works for discrete but not for string.

I don't get this one. I removed the column with counts of missing values.

Trying this I discovered and fixed another bug - if categorical variable was a meta, the widget crashed.

@janezd
Copy link
Contributor Author

janezd commented Oct 23, 2022

I don't get this one. I removed the column with counts of missing values.

Maybe I do - maybe I just fell into the same trap. Have you been testing Split or Text to Columns? I renamed the widget. :)

@janezd janezd force-pushed the split-polish branch 2 times, most recently from a88258a to 7484867 Compare October 23, 2022 14:30
@erikafuna
Copy link

@ajdapretnar and @janezd, we forgot about this one, didn't we?

@codecov-commenter
Copy link

codecov-commenter commented Nov 21, 2023

Codecov Report

❌ Patch coverage is 93.02326% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 35.99%. Comparing base (5ef0d45) to head (a74ea06).
⚠️ Report is 49 commits behind head on master.

Files with missing lines Patch % Lines
...rangecontrib/prototypes/widgets/owtexttocolumns.py 93.02% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #253      +/-   ##
==========================================
+ Coverage   33.92%   35.99%   +2.06%     
==========================================
  Files           9       10       +1     
  Lines        1515     1478      -37     
==========================================
+ Hits          514      532      +18     
+ Misses       1001      946      -55     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@janezd
Copy link
Contributor Author

janezd commented Nov 21, 2023

Not at all. I seldom go to sleep without thinking about it, and @ajdapretnar and I discuss it at least once a week, sometimes even daily.

Seriously, how did you stumble upon this? Yes, it could be merged, I think. I also updated it (now that biolab/orange3#6058 is merged and released). @ajdapretnar, check and merge at will. We can also move it from prototypes; as I recall, it is used for some fields from google forms.

@markotoplak
Copy link
Member

Failing tests are also not an issue here and we'll fix them later. @erikafuna found a question posted on FB regarding this widget and she asked me which widget that was. Then we searched and found this. Merging. If @ajdapretnar has any comments, let's fix them later.

@markotoplak markotoplak changed the title Split: Refactor for discrete values, add tests, rename Split: Refactor for discrete values, add tests, rename to "Text to Columns" Nov 22, 2023
@markotoplak markotoplak merged commit f08b42f into biolab:master Nov 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants