Windows 7 64 bit, MRO 3.3.2, latest Github version of sf.
nc = st_read(system.file("shape/nc.shp", package="sf"))
st_write(nc, "test.shp")
st_write(nc, "test.shp", update = TRUE)
list.files(pattern = "test.shp")
returns
[1] "test.shp" "test.shp.dbf" "test.shp.prj" "test.shp.shp" "test.shp.shx"
where only "test.shp" was expected.
I'm afraid your expectations were too optimistic. sf still has no support for deleting data sources before writing.
Well then isn't this description of the update option in the documentation slightly misleading?
For non database-type drivers that save a single layer (e.g. ESRI Shapefile and GeoJSON) this is roughly equivalent to overwrite in functions such as writeRaster from the raster package.
For reference, overwrite in writeRaster:
overwrite: Logical. If TRUE, "filename" will be overwritten if it exists
In any case, is the addition of an extra .shp in the filename intentional, and if so, why?
This description comes from @Robinlovelace , see #274, and might still reflect the same optimism that update would automatically overwrite. Noone has implemented a delete operation (which gdal supports) so far; rgdal::writeOGR has it.
The addition of the extra .shp is what GDAL does; see http://www.gdal.org/drv_shapefile.html
You might get further by using the dsn, layer and driver arguments, but I don't see how we can get st_write cause the removal of shapefiles before writing.
if (file.exists(dsn) && update) unlink(dsn) ?
That's what I do in functions that would otherwise fail. Though this is not quite correct for multi-file types such as .shp, GDAL seems to have no problems with overwriting the associated files if they exist. Though I have not extensively tested this!
This approach has typically had issues on Windows platforms - if you care to look at rgrass7 and earlier attempts to unlink files on Windows that may have been owned by other processes. You need to use the GDAL functions, as in rgdal, and be very careful with destroying a whole directory when dsn is not a file. It is not at all simple to get right cross-platform.
Please also go through #274 to understand why such a solution can't be part of sf.
@rsbivand thanks for sharing this insight! I'll make sure to test whether dsn is a directory before unlinking.
file_test("-d", dsn) would be sufficient I assume.
@edzer I agree that a GDAL based solution should be the way to achieve this. I personally have no problems with handling this myself.
@tim-salabim I applaud any effort to implement this! In the meantime, should the docs be made more consistent with actual behavior?
@eivindhammers don't misunderstand me, I am not planning to implement this in sf. I merely mean that I have no problems with handling this in my own scripts/functions.
Just to say: yes that was me who wrote the demonstrably incorrect documentation. Apologies. I did not test on Windows and subsequently realised that does not behave as stated for .geojsons either I don't think. @tim-salabim good timing with you latest comment: I was about to say I would be up for helping you implement that. I think a GDAL delete interface would be best but do not currently have the skill/capacity to do this. Heads up @Nowosad who has also discussed this with me and Edzer.
st_write(nc, "test.shp", delete_dsn = TRUE)
will now first delete _data source_ test.shp (which might be a directory) before attempting to write. I added a parameter delete_layer to only delete the layer, but haven't implemented it. For this use case, deleting the dsn seems to work.
Great stuff. From a user's perspective it may be more useful to have delete_dsn = TRUE if overwrite = TRUE (doable?) but can see why this new parameter has been added: to keep sf as faithful to GDAL as possible.
Out of interest, where are GDAL's deleting capabilities/args documented? This is the best I can find (always struggle to find any docs about GDAL - probably not looking hard enough!): http://www.gdal.org/gdalmanage.html
We now have tidy versions
> library(sf)
Linking to GEOS 3.5.1, GDAL 2.1.3, proj.4 4.9.2, lwgeom 2.3.2 r15302
> nc = read_sf(system.file("shape/nc.shp", package="sf"))
> write_sf(nc, "test.shp")
> write_sf(nc, "test.shp")
that are silent and overwrite (have delete_layer=TRUE) by default.
The original construct with update = TRUE now leads to an error; update is for adding layers to existing datasources (like adding a table to an existing database), not for overwriting.
Nice! Great work, thanks!
Great stuff!
Awesome work, that will make life much easier for many people!
Most helpful comment
We now have tidy versions
that are silent and overwrite (have
delete_layer=TRUE) by default.The original construct with
update = TRUEnow leads to an error; update is for adding layers to existing datasources (like adding a table to an existing database), not for overwriting.