Sf: saving sf POINT object as csv

Created on 29 Mar 2017 · 12 Comments · Source: r-spatial/sf

Sorry for opening yet another issue, but I recently ran into this.
There are situations where people have a table (e.g. a .csv with x, y (lon, lat) coordinates) that they want to import into R, do some calculations on, and then save as .csv again.

st_write seems to fail on the .csv extension, while write.csv works but splits the geometry column at the comma between the coordinates. Hence, when reading the file back into R, we get a wrong data frame structure.

library(sf)
library(ggmap)

data(crime)
crime_sf = st_as_sf(crime, coords = c("lon", "lat"), crs = 4326)

st_write(crime_sf, "crime.csv") # gives an error with sf version 0.4.2
write.csv(crime_sf, "crime.csv")

tst <- read.csv("crime.csv")

head(tst)

gives

                        X     time date hour            premise offense      beat     block street type
82729 2010-01-01 07:00:00 1/1/2010    0  18A             murder   15E30 9600-9699   marlive     ln    -
82730 2010-01-01 07:00:00 1/1/2010    0  13R            robbery   13D10 4700-4799 telephone     rd    -
82731 2010-01-01 07:00:00 1/1/2010    0  20R aggravated assault   16E20 5000-5099  wickview     ln    -
82732 2010-01-01 07:00:00 1/1/2010    0  20R aggravated assault    2A30 1000-1099   ashland     st    -
82733 2010-01-01 07:00:00 1/1/2010    0  20A aggravated assault   14D20 8300-8399    canyon           -
82734 2010-01-01 07:00:00 1/1/2010    0  20R           burglary   18F60 9300-9399     rowan     ln    -
      suffix  number  month                      day          location       address     geometry
82729      1 january friday    apartment parking lot   9650 marlive ln c(-95.4373883  29.6779015)
82730      1 january friday road / street / sidewalk 4750 telephone rd c(-95.2988769  29.6917121)
82731      1 january friday        residence / house  5050 wickview ln  c(-95.455864  29.5992174)
82732      1 january friday        residence / house   1050 ashland st c(-95.4033373  29.7902425)
82733      1 january friday                apartment       8350 canyon c(-95.3779081  29.6706341)
82734      1 january friday        residence / house     9350 rowan ln c(-95.5483009  29.7022336)

Am I missing an obvious solution?

All 12 comments

What do you think about this solution?

library(sf)
library(tidyverse)
library(ggmap)

data(crime)
crime %>%
        st_as_sf(., coords = c("lon", "lat"), crs = 4326) %>% 
        cbind(., st_coordinates(.)) %>% 
        st_set_geometry(NULL) %>% 
        write_csv(., 'my_file.csv')

Trying to save to any geospatial format raises an error on my side. I haven't investigated much further, but it might be a bug somewhere other than in GDAL's CSV driver.

st_write(crime_sf, "crime.gpkg")
#'>  Error in UseMethod("st_agr<-") : 
#'>   no applicable method for 'st_agr<-' applied to an object of class "data.frame"
#'> In addition: Warning message:
#'> In clean_columns(obj, factorsAsCharacter) :
#'>   ignoring columns with unsupported type:
#'> timePOSIXt

> data(crime)
> class(crime$time)
[1] "POSIXt"  "POSIXct"

which is VERY odd, if you ask me. Anyway, we now do this more robustly.
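
To illustrate what is odd here: date-times created by base R carry the class vector in the order "POSIXct", "POSIXt", whereas crime$time has the two entries reversed. The following is only a sketch in base R that mimics the reversed order on a made-up timestamp and restores the usual one; it is an assumption for illustration and not necessarily how sf now handles such columns.

```r
# The usual class vector of a date-time in R: "POSIXct" first, then "POSIXt".
standard <- as.POSIXct("2010-01-01 07:00:00", tz = "UTC")
class(standard)

# Mimic the reversed order reported for crime$time (made-up value, not the real data).
odd <- standard
class(odd) <- c("POSIXt", "POSIXct")

# Restore the conventional order; the underlying numeric values are untouched.
fixed <- odd
class(fixed) <- c("POSIXct", "POSIXt")
```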

Raised the crime$time issue here.

@tim-salabim, this now seems to work...

library(sf)
library(ggmap)
data(crime)
crime_sf = st_as_sf(crime, coords = c("lon", "lat"), crs = 4326)

st_write(crime_sf, "crime.csv")

Do you see any coordinates in the resulting csv?

I don't. But I am guessing that this is the documented behaviour of the GDAL CSV driver?
In general I am thinking that, given we cannot st_read("file.csv") directly even if it contains coordinates (or am I missing something?), why should we be able to st_write coordinates directly? Maybe some sort of st_as_csv function with arguments like drop_sf_column and append_coordinates would be nice to have, so that we can do

dframe = read.csv("file.csv")
dframe_sf = st_as_sf(dframe, coords = c("x", "y"), crs = 4326)
... <analysis>...
dframe_csv = st_as_csv(dframe_sf, drop_sf_column = TRUE, append_coordinates = TRUE)
write.csv(dframe_csv, "file.csv")

From my point of view that would make sense regarding the similarity of the approaches to read and write.

I'd rather shift the problem from writing csv to conversion from sf to data.frame while resolving the geometry list columns for POINT. E.g. by

st_as_xyz = function(x) data.frame(st_coordinates(x), st_set_geometry(x, NULL))
demo(meuse_sf)
summary(st_as_xyz(meuse_sf))

It probably should check that this concerns POINT geoms; anything else will fail.

Yes, this seems like a good approach! I'd be happy with this implementation. Maybe having an option to pass the colnames for the coordinate columns would be nice, but not that important. Will this also work with multiple geometry columns?
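
A minimal sketch of how the helper proposed above could guard against non-POINT input and accept custom names for the coordinate columns. Note that st_as_xyz is not an sf function, and the coord_names argument and the toy data are assumptions made up for illustration. The sketch only looks at the active geometry column, so it leaves the question about multiple geometry columns open.

```r
library(sf)

# Hypothetical helper (not part of sf): flatten POINT geometries into
# plain coordinate columns followed by the attribute columns.
st_as_xyz <- function(x, coord_names = NULL) {
  stopifnot(inherits(x, "sf"))
  # Refuse anything that is not a POINT, since other types yield
  # several coordinate rows per feature.
  if (!all(st_geometry_type(x) == "POINT"))
    stop("st_as_xyz() only handles POINT geometries")
  coords <- as.data.frame(st_coordinates(x))
  # Optionally rename the coordinate columns (default: X, Y[, Z]).
  if (!is.null(coord_names)) {
    stopifnot(length(coord_names) == ncol(coords))
    names(coords) <- coord_names
  }
  # Drop the geometry list column and bind the coordinates in front.
  cbind(coords, st_set_geometry(x, NULL))
}

# Toy data, made up for this example.
pts <- st_as_sf(
  data.frame(id = 1:2, lon = c(-95.5, -95.25), lat = c(29.75, 29.5)),
  coords = c("lon", "lat"), crs = 4326
)
xy <- st_as_xyz(pts, coord_names = c("lon", "lat"))
```

The result is an ordinary data.frame, so write.csv(xy, "file.csv", row.names = FALSE) writes one column per coordinate.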

GDAL offers a few options for reading and writing csv:

st_write(crime_sf, "crime.csv", layer_options = "GEOMETRY=AS_XY")

Also

st_write(crime_sf, "crime_wkt.csv", layer_options = "GEOMETRY=AS_WKT")
x <- st_read("crime_wkt.csv", options = "GEOM_POSSIBLE_NAMES=WKT")

I haven't checked how AS_XY handles geometries other than points.

Here's my selection from the csv driver page:

Writing

GEOMETRY (Starting with GDAL 1.6.0): By default, the geometry of a feature written to a .csv file is discarded. It is possible to export the geometry in its WKT representation by specifying GEOMETRY=AS_WKT. It is also possible to export point geometries into their X,Y,Z components (different columns in the csv file) by specifying GEOMETRY=AS_XYZ, GEOMETRY=AS_XY or GEOMETRY=AS_YX. The geometry column(s) will be prepended to the columns with the attributes values.

Reading

X_POSSIBLE_NAMES=list_of_names. (GDAL >= 2.1) Comma separated list of possible names for X/longitude coordinate of a point. Each name might be a pattern using the star character in starting and/or ending position. E.g.: prefix*, *suffix or *middle*. The values in the column must be floating point values. X_POSSIBLE_NAMES and Y_POSSIBLE_NAMES must be both specified and a matching for each must be found in the columns of the CSV file. Only one geometry column per layer might be built when using X_POSSIBLE_NAMES/Y_POSSIBLE_NAMES.

Y_POSSIBLE_NAMES=list_of_names. (GDAL >= 2.1) Comma separated list of possible names for Y/latitude coordinate of a point. Each name might be a pattern using the star character in starting and/or ending position. E.g.: prefix*, *suffix or *middle*. The values in the column must be floating point values. X_POSSIBLE_NAMES and Y_POSSIBLE_NAMES must be both specified and a matching for each must be found in the columns of the CSV file.

Z_POSSIBLE_NAMES=list_of_names. (GDAL >= 2.1) Comma separated list of possible names for Z/elevation coordinate of a point. Each name might be a pattern using the star character in starting and/or ending position. E.g.: prefix*, *suffix or *middle*. The values in the column must be floating point values. Only taken into account in combination with X_POSSIBLE_NAMES and Y_POSSIBLE_NAMES.

GEOM_POSSIBLE_NAMES=list_of_names. (GDAL >= 2.1) Comma separated list of possible names for geometry columns that contain geometry definitions encoded as WKT, WKB (in hexadecimal form, potentially in PostGIS 2.0 extended WKB) or GeoJSON. Each name might be a pattern using the star character in starting and/or ending position. E.g.: prefix*, *suffix or *middle*.

KEEP_GEOM_COLUMNS=YES/NO (default YES) Expose the detected X,Y,Z or geometry columns as regular attribute fields.
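
As a rough sketch of how the writing and reading options quoted above might combine into a round trip for points: the toy data and the temporary file are made up for illustration, and the X and Y column names assume that GEOMETRY=AS_XY writes the coordinates under those headers. A CSV file carries no CRS, so it has to be set again after reading.

```r
library(sf)

# Toy points, made up for this example.
pts <- st_as_sf(
  data.frame(id = 1:3, lon = c(-95.5, -95.25, -95.75), lat = c(29.75, 29.5, 29.25)),
  coords = c("lon", "lat"), crs = 4326
)

f <- tempfile(fileext = ".csv")

# Write the coordinates as separate columns in front of the attributes.
st_write(pts, f, layer_options = "GEOMETRY=AS_XY", quiet = TRUE)

# Read back, telling the CSV driver which columns hold the coordinates;
# KEEP_GEOM_COLUMNS=NO asks it not to expose them again as attribute fields.
back <- st_read(
  f,
  options = c("X_POSSIBLE_NAMES=X", "Y_POSSIBLE_NAMES=Y", "KEEP_GEOM_COLUMNS=NO"),
  quiet = TRUE
)

# The CRS is not stored in the file, so assign it again.
st_crs(back) <- 4326
```

Attribute columns come back as character unless the driver is asked to detect types, so numeric fields may need converting afterwards.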

The driver documentation nearly always brings relief and surprise.

@tim-salabim, this settles the issue, then?

Indeed!
