Hi,
I have some problems with this code:
w = pysal.queen_from_shapefile("zats.shp", "zat_id")
w.transform = 'v'
and I get this result:
WARNING: there are 43 disconnected observations
('Island ids: ', [58, 141, 270, 318, 323, 334, 383, 407, 422, 503, 524, 535, 536, 543, 545, 562, 573, 605, 622, 624, 666, 670, 703, 754, 939, 940, 941, 942, 943, 944, 946, 947, 950, 951, 952, 954, 955, 957, 964, 970, 972, 986, 993])
which is not true. As you can see, zat_id = 58 (the red one) is adjacent to two other regions (e.g. 57).
Why? Thanks
import os; print(os.name, os.sys.platform);print(os.uname())
posix darwin
posix.uname_result(sysname='Darwin', nodename='Marcos-MacBook-Air-2.local', release='15.3.0', version='Darwin Kernel Version 15.3.0: Thu Dec 10 18:40:58 PST 2015; root:xnu-3248.30.4~1/RELEASE_X86_64', machine='x86_64')
import sys; print(sys.version)
3.4.3 (v3.4.3:9b73f1c3e601, Feb 23 2015, 02:52:03)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
import scipy; print(scipy.__version__)
0.18.1
import numpy; print(numpy.__version__)
1.11.2
Looking at the sample data in QGIS and zooming way in, I can see that zat_id 58 is in fact an island due to, presumably, digitization error. Unit 58 and unit 57 (the polygon to the north) do not have a shared edge or shared vertices, so the code is doing what it should do - identifying an island. Unfortunately, it looks like some topology checking and fixing is going to be required to get the results that you are expecting.
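For later readers: one quick way to confirm this kind of gap yourself is a sketch like the following, using geopandas/shapely (this assumes zat_id is stored as an integer field; the ids are the ones discussed above):

```python
import geopandas

# load the OP's shapefile and index the rows by region id
gdf = geopandas.read_file("zats.shp").set_index("zat_id")

poly57 = gdf.loc[57, "geometry"]
poly58 = gdf.loc[58, "geometry"]

# touches() is True only when the two boundaries intersect; a digitization
# gap makes it False, and distance() then reports how wide the gap is
print(poly57.touches(poly58))   # False -> no shared edge or vertex
print(poly57.distance(poly58))  # small positive number: the gap width
```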
Thank you very much. I'm sorry, I had not noticed this :(
This is not entirely uncommon. I'm wondering if we should consider a "fuzzy" contiguity builder that could be used in these cases. Granted, it would no doubt be slow, but it might be an option a user could turn to when their shapefile was not planar enforced.
An option would be a utility function that calculates distances between borders? Not sure how this would work, but it might be useful for catching these cases. Maybe run a straight contiguity and, for any island, calculate the distance between each of its edges and the closest one?
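Something along these lines, assuming a geopandas/libpysal stack (the reporting is purely illustrative, not a proposed API):

```python
import geopandas
from libpysal.weights import Queen

gdf = geopandas.read_file("zats.shp")
w = Queen.from_dataframe(gdf)

# for each island, measure the gap to its nearest polygon; a tiny positive
# value is the signature of a digitization error rather than a true island
for island in w.islands:
    geom = gdf.geometry.iloc[island]
    gaps = gdf.geometry.drop(index=island).distance(geom)
    print(f"island {island}: nearest polygon {gaps.idxmin()} at {gaps.min():.6f}")
```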
I would suggest adding a warning. I just realized that many of the weights matrices I calculated were wrong. I found this error by chance, only because that region was "super" disconnected :))
Would a rounding threshold work? By default, accept all the coordinates at the supplied precision, but allow the user to truncate this? The concern would be false positives due to rounding.
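As a toy illustration of that truncation idea (snap is a hypothetical helper, not a PySAL API):

```python
def snap(point, ndigits=3):
    """Round a coordinate pair so near-coincident vertices compare equal."""
    return (round(point[0], ndigits), round(point[1], ndigits))

a = (6658394.9601, 625420.7299)
b = (6658394.9603, 625420.7301)  # the "same" vertex, digitized slightly off

assert a != b              # exact comparison misses the shared vertex
assert snap(a) == snap(b)  # truncated comparison treats it as shared
```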
Let's say it would be better to have a warning when two regions are very close (but how do we define "very"?). This is because I did not know I had the problem before your answer :))
@jlaura I find it does.
In my Julia implementation of a contiguity constructor (https://github.com/ljwolf/SpatialWeights.jl/blob/master/src/weights.jl#L14), I use a significand, and it seems to work just fine.
There is definitely a hit from doing approximate array comparison on points or vectors in each check but, overall, that Julia implementation was about as fast (on anything at or below NAT.shp) as what we had before your high performance weights got merged.
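In Python/numpy terms, that approximate comparison amounts to something like this (the tolerance is purely illustrative):

```python
import numpy

p = numpy.array([6658394.9601, 625420.7299])
q = numpy.array([6658394.9603, 625420.7301])

print(numpy.array_equal(p, q))                  # False: exact match fails
# rtol=0 so only the absolute tolerance matters for these large coordinates
print(numpy.allclose(p, q, rtol=0, atol=1e-3))  # True: tolerant match succeeds
```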
It seems good :))
@ljwolf Do you happen to have a test data set with known results given some significand? The addition of the parameter is easy, but I want to get it in with an appropriate test case.
Nope... would the source data for this issue work?
That was going to be my fallback: subset this dataset given the OP's list of islands. Looks like the plan.
Hi guys,
Unfortunately, this is happening to me when I try to create the queen contiguity matrix using the attached shapefile. When I use the ps.queen_from_shapefile function, it warns me that there are 30 islands in the shapefile, which is not the case.
When I use the same shapefile with the poly2nb function in R, it does not give me any islands. The screenshots below illustrate this:
Is there a way to manage this behavior?
Thanks a lot!
Best,
Renan
As @jlaura commented above, this is likely due to the shapefile having digitization issues so that the islands PySAL identifies are polygons not sharing edges or vertices with any other polygon. Visually it may appear that there are no islands, but from a planar enforcement perspective, there are.
One suggestion is to export your neighbors list from R and then read it into PySAL, if you want to do further processing with PySAL.
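For example, assuming the spdep package on the R side (its write.nb.gal function exports the poly2nb result to a GAL file; the filename here is just a placeholder), the Python side would be:

```python
import pysal as ps

# read the GAL file exported from R's spdep into a PySAL W object
w = ps.open("neighbors.gal").read()
print(w.n, w.islands)
```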
Alright, I believe this is precisely the problem with my shapefile! Thanks for the suggestion!
There is a libpysal PR that supports this approach:
Yes! I believe this solves the digitization issue!
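For later readers: this landed in libpysal as a fuzzy_contiguity builder. A minimal sketch, assuming a recent libpysal and geopandas (the tolerance value is illustrative and data-dependent):

```python
import geopandas
from libpysal.weights import fuzzy_contiguity

gdf = geopandas.read_file("zats.shp")

# treat polygons within a small tolerance of one another as contiguous,
# so tiny digitization gaps no longer produce false islands
w = fuzzy_contiguity(gdf, tolerance=0.001)
print(w.islands)  # ideally empty now
```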
Hello everyone @sjsrey @darribas @ljwolf, I have an issue with the weight matrix W in the pysal package. I'm working on data points with their lat and long, so I create my neighbors dict as
neighbors = {0: [((x1,y1),(x1,y1)), ((x1,y1),(x2,y2)), ...], 1: [...], ...}
and the associated weights matrix with the same keys and real distances:
weights = {0: OrderedDict([(0, 0.0),
                           (1, 429.699134271888),
                           (2, 1373.9114796763188),
                           (3, 974.1306507848191),
                           (4, 620.9800905163695),
                           ...]),
           1: ..., ...}
The problem is that when I use them in W, it gives me the following error:
w = W(neighbors, weights)

```
KeyError                                  Traceback (most recent call last)
----> 1 w = W(neighbors, weights)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pysal\lib\weights\weights.py in __init__(self, neighbors, weights, id_order, silence_warnings, ids)
    167         warnings.warn("There are %d disconnected observations" % ni + ' \n '
    168                       " Island ids: %s" % ', '.join(str(island) for island in self.islands))
--> 169     if self.n_components > 1 and not self.islands and not self.silent_connected_components:
    170         warnings.warn("The weights matrix is not fully connected. There are %d components" % self.n_components)
    171

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pysal\lib\weights\weights.py in n_components(self)
    338         """
    339         if 'n_components' not in self._cache:
--> 340             self._n_components, self._component_labels = connected_components(self.sparse)
    341             self._cache['n_components'] = self._n_components
    342             self._cache['component_labels'] = self._component_labels

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pysal\lib\weights\weights.py in sparse(self)
    329         """
    330         if 'sparse' not in self._cache:
--> 331             self._sparse = self._build_sparse()
    332             self._cache['sparse'] = self._sparse
    333         return self._sparse

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pysal\lib\weights\weights.py in _build_sparse(self)
    362         data = []
    363         id2i = self.id2i
--> 364         for i, neigh_list in list(self.neighbor_offsets.items()):
    365             card = self.cardinalities[i]
    366             row.extend([id2i[i]] * card)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pysal\lib\weights\weights.py in neighbor_offsets(self)
    859         id2i = self.id2i
    860         for j, neigh_list in list(self.neighbors.items()):
--> 861             self.__neighbors_0[j] = [id2i[neigh] for neigh in neigh_list]
    862         self._cache['neighbors_0'] = self.__neighbors_0
    863         return self.__neighbors_0

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pysal\lib\weights\weights.py in <listcomp>(.0)
    859         id2i = self.id2i
    860         for j, neigh_list in list(self.neighbors.items()):
--> 861             self.__neighbors_0[j] = [id2i[neigh] for neigh in neigh_list]
    862         self._cache['neighbors_0'] = self.__neighbors_0
    863         return self.__neighbors_0

KeyError: ((6658394.96, 625420.73), (6658394.96, 625420.73))
```
This is what it looks like, @joshk @sjsrey @denadai2, and the error above is what I get.
Hey @Morenagl, thanks for the question! Your dictionary is not in the right format. Also, your problem is not related to the concerns discussed in this issue. If you still have issues after this comment, let's open another issue, or check in with the developers on the gitter channel or the thespatialcommunity.slack.com #python channel.
The PySAL `W` object is built with the key of the dict as the "name" or "index" of the point, and the value being the list of neighbors (or list of weights) that corresponds to that index. So, to encode a line of three observations:
from pysal.lib.weights import W

# keys are the observation "names"; values list each observation's neighbors
neighbors = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b']}
# same keys; each value holds the weights aligned with the neighbor list
weights = {'a': [1], 'b': [1, 1], 'c': [1]}
my_w = W(neighbors, weights)
Then, this should look like you'd expect for an adjacency matrix of three observations in a line:
my_w.full()[0]
array([[0., 1., 0.],
[1., 0., 1.],
[0., 1., 0.]])
In your case, you seem to want to use (x0,y0) as the "name" for point 0, rather than just calling it 0. So, if you use the tuple form, this might look something like:
neighbors = {(x0,y0):[(x1,y1), (x2,y2)],
(x1,y1):[(x0,y0), (x2,y2)],
(x2,y2):[(x0,y0), (x1,y1)]}
weights = {(x0,y0):[d01, d02],
(x1,y1):[d10, d12],
(x2,y2):[d20, d21]}
Note that the keys are always the "names". In your example, you mix between names, using both 0 and (x0,y0).
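To make that concrete with made-up coordinates for three points in a line (so it actually runs):

```python
from pysal.lib.weights import W

x0, y0 = 0.0, 0.0
x1, y1 = 1.0, 0.0
x2, y2 = 2.0, 0.0

# the coordinate tuples are used consistently as the observation names
neighbors = {(x0, y0): [(x1, y1)],
             (x1, y1): [(x0, y0), (x2, y2)],
             (x2, y2): [(x1, y1)]}
weights = {(x0, y0): [1.0],
           (x1, y1): [1.0, 1.0],
           (x2, y2): [1.0]}

my_w = W(neighbors, weights)
print(my_w.n)  # 3
```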
But, if you've already got the distance matrix formed (say, from using scipy.spatial.distance.pdist and scipy.spatial.distance.squareform, or sklearn.metrics.pairwise...), you can use full2W. As an example:
from pysal.lib.weights.util import full2W
from scipy.spatial import distance
import numpy

# five random points in the plane
random_data = numpy.random.normal(size=(5, 2))
# dense, symmetric matrix of pairwise Euclidean distances
distance_matrix = distance.squareform(distance.pdist(random_data))
# build a W directly from the dense matrix
my_w = full2W(distance_matrix)
@ljwolf Thank you for all that, I'll check everything you said and try to recreate the right format.
I just have one question: which distance does that function calculate? I'm focusing on the real one, so I could just replace distance.pdist with my own and I'd have the squareform.
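For what it's worth, scipy's pdist computes Euclidean distance by default, and it accepts any callable as the metric. So a sketch of swapping in a "real" great-circle distance (the haversine function here is hand-written purely for illustration) could look like:

```python
import numpy
from scipy.spatial import distance

def haversine(p, q, radius=6371000.0):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = numpy.radians([p[0], p[1], q[0], q[1]])
    a = (numpy.sin((lat2 - lat1) / 2) ** 2
         + numpy.cos(lat1) * numpy.cos(lat2) * numpy.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius * numpy.arcsin(numpy.sqrt(a))

points = numpy.array([[48.8566, 2.3522],    # Paris
                      [51.5074, -0.1278],   # London
                      [52.5200, 13.4050]])  # Berlin

# pdist(X) defaults to metric='euclidean'; pass a callable to override it
real_distances = distance.squareform(distance.pdist(points, metric=haversine))
print(real_distances.round())
```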