Sorry for opening yet another issue, but I recently ran into this.
There are situations where people have a table (e.g. a .csv with x, y (lon, lat) coordinates) that they want to import into R, do some calculations on, and then save as .csv again.
st_write seems to fail on the .csv extension, while write.csv works but splits the geometry column at the comma. Hence, when reading the file back into R, we get a wrong data frame structure.
library(sf)
library(ggmap)
data(crime)
crime_sf = st_as_sf(crime, coords = c("lon", "lat"), crs = 4326)
st_write(crime_sf, "crime.csv") # gives an error with sf version 0.4.2
write.csv(crime_sf, "crime.csv")
tst <- read.csv("crime.csv")
head(tst)
gives
X time date hour premise offense beat block street type
82729 2010-01-01 07:00:00 1/1/2010 0 18A murder 15E30 9600-9699 marlive ln -
82730 2010-01-01 07:00:00 1/1/2010 0 13R robbery 13D10 4700-4799 telephone rd -
82731 2010-01-01 07:00:00 1/1/2010 0 20R aggravated assault 16E20 5000-5099 wickview ln -
82732 2010-01-01 07:00:00 1/1/2010 0 20R aggravated assault 2A30 1000-1099 ashland st -
82733 2010-01-01 07:00:00 1/1/2010 0 20A aggravated assault 14D20 8300-8399 canyon -
82734 2010-01-01 07:00:00 1/1/2010 0 20R burglary 18F60 9300-9399 rowan ln -
suffix number month day location address geometry
82729 1 january friday apartment parking lot 9650 marlive ln c(-95.4373883 29.6779015)
82730 1 january friday road / street / sidewalk 4750 telephone rd c(-95.2988769 29.6917121)
82731 1 january friday residence / house 5050 wickview ln c(-95.455864 29.5992174)
82732 1 january friday residence / house 1050 ashland st c(-95.4033373 29.7902425)
82733 1 january friday apartment 8350 canyon c(-95.3779081 29.6706341)
82734 1 january friday residence / house 9350 rowan ln c(-95.5483009 29.7022336)
Am I missing an obvious solution?
What do you think about this solution?
library(sf)
library(tidyverse)
library(ggmap)
data(crime)
crime %>%
st_as_sf(., coords = c("lon", "lat"), crs = 4326) %>%
cbind(., st_coordinates(.)) %>%
st_set_geometry(NULL) %>%
write_csv(., 'my_file.csv')
Trying to save to any geospatial format raises an error on my side. I haven't investigated much further, but it might be a different bug, unrelated to GDAL's csv driver.
st_write(crime_sf, "crime.gpkg")
#'> Error in UseMethod("st_agr<-") :
#'> no applicable method for 'st_agr<-' applied to an object of class "data.frame"
#'> In addition: Warning message:
#'> In clean_columns(obj, factorsAsCharacter) :
#'> ignoring columns with unsupported type:
#'> timePOSIXt
> data(crime)
> class(crime$time)
[1] "POSIXt" "POSIXct"
which is VERY odd, if you ask me: a POSIXct object normally has class c("POSIXct", "POSIXt"), in that order. Anyway, we now handle this more robustly.
Raised the crime$time issue here.
@tim-salabim, this now seems to work...
library(sf)
library(ggmap)
data(crime)
crime_sf = st_as_sf(crime, coords = c("lon", "lat"), crs = 4326)
st_write(crime_sf, "crime.csv")
Do you see any coordinates in the resulting csv?
I don't. But I am guessing that it is documented behaviour of the GDAL CSV driver?
In general I am thinking that, given we cannot st_read("file.csv") directly even if it contains coordinates (or am I missing something?), why should we be able to st_write coordinates directly? Maybe some sort of st_as_csv function with arguments like drop_sf_column and append_coordinates would be nice to have, so that we can do
dframe = read.csv("file.csv")
dframe_sf = st_as_sf(dframe, coords = c("x", "y"), crs = 4326)
... <analysis>...
dframe_csv = st_as_csv(dframe_sf, drop_sf_column = TRUE, append_coordinates = TRUE)
write.csv(dframe_csv, "file.csv")
From my point of view that would make sense regarding the similarity of the approaches to read and write.
I'd rather shift the problem from writing csv to conversion from sf to data.frame while resolving the geometry list columns for POINT. E.g. by
st_as_xyz = function(x) data.frame(st_coordinates(x), st_set_geometry(x, NULL))
demo(meuse_sf)
summary(st_as_xyz(meuse_sf))
It probably should check that this concerns POINT geoms, anything else will fail.
Yes, this seems like a good approach! I'd be happy with this implementation. Maybe having an option to pass the colnames for the coordinate columns would be nice, but not that important. Will this also work with multiple geometry columns?
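A minimal sketch of how the helper could look with the POINT check and optional coordinate column names discussed above. This is an illustration, not part of sf: the coord_names argument and the small pts object are made up for the example. It only handles the active geometry column; as far as I can tell, any additional geometry columns would stay in the result as list columns.

```r
library(sf)

# Hypothetical helper (not an sf function): convert an sf object with POINT
# geometries to a plain data.frame with the coordinates as leading columns.
st_as_xyz <- function(x, coord_names = NULL) {
  stopifnot(inherits(x, "sf"))
  # For other geometry types st_coordinates() returns one row per vertex
  # plus index columns, which would no longer line up with the attributes.
  if (!all(st_geometry_type(x) == "POINT")) {
    stop("st_as_xyz() only supports POINT geometries")
  }
  coords <- as.data.frame(st_coordinates(x))
  if (!is.null(coord_names)) {
    stopifnot(length(coord_names) == ncol(coords))
    names(coords) <- coord_names
  }
  # Drop the active geometry column and put the coordinates in front.
  cbind(coords, st_set_geometry(x, NULL))
}

# Made-up example data
pts <- st_as_sf(
  data.frame(id = 1:2, lon = c(-95.44, -95.30), lat = c(29.68, 29.69)),
  coords = c("lon", "lat"), crs = 4326
)
xy <- st_as_xyz(pts, coord_names = c("lon", "lat"))
```

The result is an ordinary data.frame, so it can be passed straight to write.csv.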
GDAL offers a few options for reading and writing csv:
st_write(crime_sf, "crime.csv", layer_options = "GEOMETRY=AS_XY")
Also
st_write(crime_sf, "crime_wkt.csv", layer_options = "GEOMETRY=AS_WKT")
x <- st_read("crime_wkt.csv", options = "GEOM_POSSIBLE_NAMES=WKT")
I haven't checked how AS_XY handles geometries other than points.
Here's my selection from the csv driver page:
GEOMETRY (Starting with GDAL 1.6.0): By default, the geometry of a feature written to a .csv file is discarded. It is possible to export the geometry in its WKT representation by specifying GEOMETRY=AS_WKT. It is also possible to export point geometries into their X,Y,Z components (different columns in the csv file) by specifying GEOMETRY=AS_XYZ, GEOMETRY=AS_XY or GEOMETRY=AS_YX. The geometry column(s) will be prepended to the columns with the attributes values.
X_POSSIBLE_NAMES=list_of_names. (GDAL >= 2.1) Comma separated list of possible names for X/longitude coordinate of a point. Each name might be a pattern using the star character in starting and/or ending position. E.g.: prefix*, *suffix or *middle*. The values in the column must be floating point values. X_POSSIBLE_NAMES and Y_POSSIBLE_NAMES must be both specified and a matching for each must be found in the columns of the CSV file. Only one geometry column per layer might be built when using X_POSSIBLE_NAMES/Y_POSSIBLE_NAMES.
Y_POSSIBLE_NAMES=list_of_names. (GDAL >= 2.1) Comma separated list of possible names for Y/latitude coordinate of a point. Each name might be a pattern using the star character in starting and/or ending position. E.g.: prefix*, *suffix or *middle*. The values in the column must be floating point values. X_POSSIBLE_NAMES and Y_POSSIBLE_NAMES must be both specified and a matching for each must be found in the columns of the CSV file.
Z_POSSIBLE_NAMES=list_of_names. (GDAL >= 2.1) Comma separated list of possible names for Z/elevation coordinate of a point. Each name might be a pattern using the star character in starting and/or ending position. E.g.: prefix*, *suffix or *middle*. The values in the column must be floating point values. Only taken into account in combination with X_POSSIBLE_NAMES and Y_POSSIBLE_NAMES.
GEOM_POSSIBLE_NAMES=list_of_names. (GDAL >= 2.1) Comma separated list of possible names for geometry columns that contain geometry definitions encoded as WKT, WKB (in hexadecimal form, potentially in PostGIS 2.0 extended WKB) or GeoJSON. Each name might be a pattern using the star character in starting and/or ending position. E.g.: prefix*, *suffix or *middle*
KEEP_GEOM_COLUMNS=YES/NO (default YES) Expose the detected X,Y,Z or geometry columns as regular attribute fields.
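For completeness, a minimal round-trip sketch that combines the writing and reading options quoted above. It assumes GDAL >= 2.1 (needed for X_POSSIBLE_NAMES/Y_POSSIBLE_NAMES) and that GEOMETRY=AS_XY names the coordinate columns X and Y; the small pts object is made up for illustration. As far as I can tell the csv file does not carry the CRS, so it is set again after reading.

```r
library(sf)

# Made-up example points (coordinates borrowed from the crime data above)
pts <- st_as_sf(
  data.frame(id = 1:3,
             lon = c(-95.4373883, -95.2988769, -95.455864),
             lat = c(29.6779015, 29.6917121, 29.5992174)),
  coords = c("lon", "lat"), crs = 4326
)

path <- tempfile(fileext = ".csv")

# Write: the coordinates go into leading X and Y columns.
st_write(pts, path, layer_options = "GEOMETRY=AS_XY", quiet = TRUE)

# Read: rebuild POINT geometries from the X and Y columns.
back <- st_read(path,
                options = c("X_POSSIBLE_NAMES=X", "Y_POSSIBLE_NAMES=Y"),
                quiet = TRUE)

# Assumption: the CRS has to be assigned again by hand.
back <- st_set_crs(back, 4326)
```

Note that the csv driver reads attribute columns as text unless told otherwise, so numeric attributes may need converting after the read.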
The driver documentation nearly always brings relief and surprise.
@tim-salabim , this settles the issue, then?
Indeed!