DataFrames.jl: Port dependencies to Missing and CategoricalArray

Created on 16 Sep 2017 · 25 Comments · Source: JuliaData/DataFrames.jl

This issue lists packages which depend on DataFrames, with links to PRs which port them to the new Missing and CategoricalArray based DataFrames. Please add new PRs as they are opened to track progress, and check boxes once they have been merged. Some packages may not need any adjustments and could be moved to a separate section.

Packages with open PRs (tick the box once PR is merged):

Packages which do not need any changes:

  • [x] StatPlots
  • [x] Twitter

Packages which need new PRs:

  • [ ] BeaData
  • [ ] Bedgraph
  • [ ] BedgraphFiles
  • [ ] Benchmark
  • [ ] BioMedQuery
  • [ ] BlsData
  • [ ] Bootstrap
  • [ ] Brim
  • [ ] COMTRADE
  • [ ] Celeste
  • [ ] CrossfilterCharts
  • [ ] DBFTables
  • [ ] DSGE
  • [ ] DataCubes
  • [ ] Deldir
  • [ ] DimensionalityReduction
  • [ ] Diversity
  • [ ] EEG
  • [ ] EasyPhys
  • [ ] EconDatasets
  • [ ] EconModels
  • [ ] EventHistory
  • [ ] ExperimentalAnalysis
  • [ ] FaSTLMM
  • [ ] FinancialMarkets
  • [ ] FredData
  • [ ] GeoIP
  • [ ] Gillespie
  • [ ] GoogleCharts
  • [ ] Graft
  • [ ] GraphGLRM
  • [ ] Hive
  • [ ] InteractiveFixedEffectModels
  • [ ] JWAS
  • [ ] JuliaFEM
  • [ ] LazyQuery
  • [ ] LifeTable
  • [ ] LinguisticData
  • [ ] LowRankModels
  • [ ] MLDataUtils
  • [ ] MachineLearning
  • [ ] Mads
  • [ ] mPulseAPI
  • [ ] MultidimensionalTables
  • [ ] NLOptControl
  • [ ] NLreg
  • [ ] OdsIO
  • [ ] OpenGene
  • [ ] OpenSecrets
  • [ ] OptiMimi
  • [ ] PGFPlots
  • [ ] PhyloNetworks
  • [ ] PrettyPlots
  • [ ] ProjectTemplate
  • [ ] PySide
  • [ ] Q
  • [ ] ROCAnalysis
  • [ ] Resampling
  • [ ] Rif
  • [ ] Robotlib
  • [ ] RobustStats
  • [ ] SloanDigitalSkySurvey
  • [ ] Sparrow
  • [ ] SpatialEcology
  • [ ] SpeedDate
  • [ ] StackedNets
  • [ ] TermWin
  • [ ] TextAnalysis
  • [ ] TimeData
  • [ ] TimeSeriesIO
  • [ ] ValueOrientedRiskManagementInsurance
  • [ ] WorldBankData
  • [ ] XSim
  • [ ] ZipCode

Abandoned packages:

  • [x] kNN

Packages that already did not load on Julia 0.6:

  • [x] Augur
  • [x] BIGUQ
  • [x] BayesNets
  • [x] BayesianDataFusion

All 25 comments

Oh man.

I can probably take a look at Bootstrap and Resampling next week. I've been wanting to transfer Bootstrap over to JuliaStats anyways (assuming the author is open to that).

I think we should get an initial tag out of the port; trying to update all this code and have travis runs isn't possible unless we're able to test against current DataFrames. We don't need to announce anything yet, and we should make sure we have really tight bounds on everything, but I think we need to tag sooner rather than later.

I think we should get an initial tag out of the port; trying to update all this code and have travis runs isn't possible unless we're able to test against current DataFrames.

That's possible by adding an explicit call to Pkg.checkout("DataFrames") in .travis.yml (see JuliaStats/RData.jl#28). Not pretty, but as a temporary way of checking the PR it works.
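As a rough sketch of that workaround, the Julia commands run from the script section of .travis.yml might look like the following on Julia 0.6. MyPackage is a placeholder name, and the exact commands used in JuliaStats/RData.jl#28 may differ.

```julia
# Julia 0.6-era Pkg API; `MyPackage` is a hypothetical package name.
Pkg.clone(pwd())                       # register the checked-out repository as a package
Pkg.checkout("DataFrames")             # switch DataFrames from the tagged release to master
Pkg.build("MyPackage")
Pkg.test("MyPackage"; coverage=true)   # run the test suite against DataFrames master
```

The checkout line is only meant as a temporary measure and can be dropped once a tagged DataFrames release includes the port.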

I think we should have the discussion about when to tag the port somewhere else, maybe in https://github.com/JuliaData/DataFrames.jl/pull/1209 since it has started there?

I think it would be useful to write up some guidelines on how to convert a dependent package to the new version of DataFrames, CategoricalArrays, etc. To even be able to install the master version of DataFrames I ended up starting from a clean package directory by setting the environment variable JULIA_PKGDIR. I don't know if this is a good approach or not but I do know that it is easy to find yourself in a "twisty maze of passages" (so who is old enough to know where that phrase originated?).

Carrying on with this "clean slate" approach (I can move this to another issue if desired). After setting the environment variable JULIA_PKGDIR to, say, ~/.juliaDF11 and starting julia, run

Pkg.init()
Pkg.add("DataFrames")
Pkg.checkout("DataFrames")
versioninfo(true)

the output on packages is

Package Directory: /home/bates/.juliaDF11/v0.6
1 required packages:
 - DataFrames                    0.10.1+            master
13 additional packages:
 - BinDeps                       0.6.0
 - CategoricalArrays             0.2.0
 - Compat                        0.31.0
 - DataStreams                   0.2.0
 - DataStructures                0.7.1
 - Nulls                         0.0.6
 - Reexport                      0.0.3
 - SHA                           0.5.1
 - SortingAlgorithms             0.1.1
 - SpecialFunctions              0.3.3
 - StatsBase                     0.19.0
 - URIParser                     0.2.0
 - WeakRefStrings                0.3.1

Do any of the additional packages need to have a Pkg.checkout before I continue?

That looks right to me.

StatPlots is compatible with DataFrames via the IterableTables package, and given that IterableTables has already updated, it doesn't need a PR.

Quandl: milktrader/Quandl.jl#111 has been merged into master.

Please mark GeoStatsDevTools.jl as fixed. I am just waiting for the new tag to be merged on METADATA.jl and then I will tag a new version of GeoStats.jl. It is currently broken with the update.

If anyone is seeing this: if you can't install DataFrames 0.11, it may be because of DataTables. For me that was what was holding back my upgrade! After I removed DataTables, DataFrames updated!
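On Julia 0.6 that fix could be reproduced roughly as follows. This is a sketch that assumes nothing else you have installed still requires DataTables; if something does, the resolver may keep DataFrames at the older version.

```julia
# Julia 0.6-era Pkg API
Pkg.rm("DataTables")          # remove the package that was holding DataFrames back
Pkg.update()                  # let the resolver pick the newest allowed versions
Pkg.installed("DataFrames")   # returns the installed version, to confirm the upgrade
```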

UAParser.jl updated for missings, just needs to be merged on METADATA

LogParser.jl update is at METADATA, just needs to be merged

Twitter.jl technically already works, but updates coming with https://github.com/randyzwitch/Twitter.jl/pull/17 to make DataFrames 0.11 a requirement

ReadStat.jl no longer depends on DataFrames, so it can be removed from the list.

IterableTables.jl, QueryOperators.jl and Query.jl should also all work with DataFrames v0.11 at this point, so no further work needed there.

I am trying to migrate but it's more complicated than what I expected.
Could someone give me a hand with this particular function?
https://discourse.julialang.org/t/tanslate-code-from-pooleddataarray-to-categoricalarray/8646
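Without seeing the function in question, only a generic sketch is possible. Assuming the old code built a PooledDataArray from a plain vector and read its pool, the CategoricalArrays counterpart might look like this (the variable names are illustrative):

```julia
using CategoricalArrays

# Old (DataArrays): pda = PooledDataArray(["a", "b", "a"])
v = categorical(["a", "b", "a"])

levels(v)      # distinct values, the role played by the old `pool` field
v[1] == "a"    # elements compare equal to the values they wrap
```

Code that reached into the refs and pool fields directly usually needs more thought than a one-to-one rename.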

Could you uncheck the Gadfly box as the PR remains unmerged?
It took me a while to find out that it was the one holding back my update. Thanks.

It's not checked, and IIRC it wasn't a few hours ago either, was it?

I would have sworn it wasn't, but perhaps I'm mistaken, in which case sorry for the noise.

Alright, ExcelFiles.jl and ExcelReaders.jl can (finally) be marked as resolved on this list. ExcelReaders.jl no longer depends on DataFrames.jl, and ExcelFiles.jl no longer depends on DataTables.jl, and both packages should work fine with the new v0.11 now.

Mimi.jl can also be marked as solved.

Closing since packages which haven't been ported don't work on Julia 1.0, so they have more serious problems.
