Aws-sdk-java: DynamoDBMapper can store List<CustomObject> but not Set<CustomObject>

Created on 23 Dec 2014 · 21 Comments · Source: aws/aws-sdk-java

Why is it that DynamoDBMapper can marshal a List but not a Set when a Set could easily be stored in Dynamo without loss of precision using a List?

Please see the example below. It works perfectly with a List, but as soon as you replace List with Set you get the following warning.

com.amazonaws.services.dynamodbv2.datamodeling.marshallers.ObjectSetToStringSetMarshaller: Marshaling a set of non-String objects to a DynamoDB StringSet. You won't be able to read these objects back out of DynamoDB unless you REALLY know what you're doing: it's probably a bug. If you DO know what you're doing feelfree to ignore this warning, but consider using a custom marshaler for this instead.

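import java.util.ArrayList;
import java.util.List;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMapper;

@RunWith(SpringJUnit4ClassRunner.class)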
@ContextConfiguration("/application-context.xml")
public class ProductIntegrationTest {
    @Autowired
    AmazonDynamoDBClient dynamoDBClient;

    @Test
    public void shouldStoreNewProduct() {

        DynamoDBMapper productMapper = new DynamoDBMapper(dynamoDBClient);

        User user1 = new User();
        user1.setUserId(1);
        user1.setFirstName("user1");

        User user2 = new User();
        user2.setUserId(2);
        user2.setFirstName("user2");

        List<User> users = new ArrayList<>();
        users.add(user1);
        users.add(user2);

        Product p = new Product();
        p.setProductName("productA");
        p.setUsers(users);

        productMapper.save(p);
    }
}

Product

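import java.util.List;

import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBAttribute;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBHashKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBTable;

@DynamoDBTable(tableName = "products")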
public class Product {
    private String productName;
    private List<User> users;

    public Product() {

    }

    @DynamoDBHashKey(attributeName = "productName")
    public String getProductName() {
        return productName;
    }

    public void setProductName(String productName) {
        this.productName = productName;
    }

    @DynamoDBAttribute(attributeName = "users")
    public List<User> getUsers() {
        return users;
    }

    public void setUsers(List<User> users) {
        this.users = users;
    }

}

User

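import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBAttribute;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBDocument;

@DynamoDBDocument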
public class User {
    private String firstName;
    private long userId;

    public User(){

    }

    @DynamoDBAttribute(attributeName = "firstName")
    public String getFirstName() {
        return firstName;
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    @DynamoDBAttribute(attributeName = "userId")
    public long getUserId() {
        return userId;
    }

    public void setUserId(long userId) {
        this.userId = userId;
    }
}

All 21 comments

This (commit https://github.com/ccoffey/aws-sdk-java/commit/5a53e8c7620f6af55952e39d0abffca4292845e0) adds support for marshalling Sets of arbitrary objects.

Is there a battery of unit tests I can execute to prove I did no harm?
What's the process for submitting this commit for code review and, hopefully, inclusion in the project?

Added a second (commit https://github.com/ccoffey/aws-sdk-java/commit/820d9124f85edd18660e950e714b2b9658207010) which cleans up the Set marshalling a little. I wasn't happy with the try, fail, try-something-else logic.

Hey Cathal,

Once upon a time, in the dark ages before DynamoDB supported List shapes, the mapper only supported Sets of a few types that we could reliably round-trip through DynamoDB's StringSet, NumberSet, and BinarySet types. Due to a quirk in the original implementation however, the mapper would actually successfully _save_ Sets of arbitrary types into DynamoDB; they took the Set<String> codepath, were converted into Strings via toString(), and these strings were stored in a DynamoDB StringSet.

When we retooled the mapper to add support for Lists/Maps, we decided it was important to keep this behavior backwards-compatible by default in case anyone was depending on it. Hence ObjectSetToStringSetMarshaller, which explicitly performs this same conversion but logs a warning if it's ever used.

If you explicitly opt in to the V2 conversion schema, we no longer register the ObjectSetToStringSetMarshaller, so we could definitely look at adding support for a new ObjectSetToListMarshaller and corresponding ListToObjectSetUnmarshaller there. Does that sound reasonable to you?
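
To make the legacy behavior described above concrete, here is a minimal sketch in plain Java with no SDK involved. PlainUser and toStringSet are hypothetical names invented for this illustration, not part of the mapper; the only assumption is that the element class does not override toString(), as with the User class in this issue.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class ToStringSetDemo {
    // Mimics the legacy code path described above: every element is
    // reduced to element.toString() and the strings form a string set.
    static Set<String> toStringSet(Set<?> objects) {
        Set<String> strings = new LinkedHashSet<>();
        for (Object o : objects) {
            strings.add(String.valueOf(o));
        }
        return strings;
    }

    public static void main(String[] args) {
        Set<PlainUser> users = new LinkedHashSet<>();
        users.add(new PlainUser(1, "user1"));
        users.add(new PlainUser(2, "user2"));
        // Each printed value is the class name plus an identity hash;
        // userId and firstName cannot be recovered from it.
        System.out.println(toStringSet(users));
    }
}

// Hypothetical stand-in for the User class; note there is no toString() override.
class PlainUser {
    final long userId;
    final String firstName;

    PlainUser(long userId, String firstName) {
        this.userId = userId;
        this.firstName = firstName;
    }
}
```

Because the default Object.toString() carries none of the object's fields, the stored strings can be written but not meaningfully read back, which is what the warning is about.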

Hey David,

Thank you for the detailed explanation, and yes, this sounds incredibly
reasonable.

Are the new ObjectSetToListMarshaller and corresponding
ListToObjectSetUnmarshaller something you're interested in looking into
soon? If not, then I would like to work on them and submit a pull request.
Any advice in this regard?

Kind regards,
Cathal

I wrote them up yesterday. :)

I want to double-check this with a couple other engineers on the team who are out on vacation this week, but assuming they don't see any red flags we should be able to get this into a release next week-ish. I'll see if I can get the change ported out from our internal build system to a GitHub pull request so you can play around with it in the meantime.

David,

That's awesome! I'm really looking forward to seeing the pull request.

Happy new year and thanks for making this happen,
Cathal

When loading items from DynamoDB, there's a possibility that multiple items in the list end up being equivalent according to the Java Set comparator - either because the comparator has changed over time, or because someone is writing items into the list using the raw DynamoDB API.

We're thinking it's worth checking this when loading items and throwing a DynamoDBMappingException if duplicates are found, rather than silently discarding some of the items from the list (and then clobbering the "duplicate" values if you write the item back to DynamoDB). Does that make sense?
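
A rough sketch of the check proposed above, in plain Java. This is not the SDK's actual unmarshaller: toSetStrict is a hypothetical helper, and IllegalStateException stands in for the DynamoDBMappingException mentioned in the comment.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ListToSetDemo {
    // Converts a stored list into a Set, failing loudly on duplicates
    // instead of silently dropping them.
    // IllegalStateException is a stand-in for DynamoDBMappingException.
    static <T> Set<T> toSetStrict(List<T> stored) {
        Set<T> result = new LinkedHashSet<>();
        for (T element : stored) {
            // Set.add returns false when an equal element is already present.
            if (!result.add(element)) {
                throw new IllegalStateException(
                        "Duplicate element in stored list: " + element);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(toSetStrict(Arrays.asList("a", "b", "c")));
        try {
            toSetStrict(Arrays.asList("a", "b", "a"));
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Failing on load keeps a later save from overwriting the stored list with a shorter one, which is the clobbering risk described above.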

This makes a lot of sense to me, I think it's the best approach.

Support for this was added in 1.9.15, thanks for the suggestion!

I just tried to run this exact example with aws-java-sdk-1.9.26 and I get a warning

WARNING: Marshaling a set of non-String objects to a DynamoDB StringSet. You won't be able to read these objects back out of DynamoDB unless you REALLY know what you're doing: it's probably a bug. If you DO know what you're doing feelfree to ignore this warning, but consider using a custom marshaler for this instead.

Then when I check my DynamoDB table I see

{
  "productName": {
    "S": "productA"
  },
  "users": {
    "SS": [
      "com.amazon.test.User@3a7b503d",
      "com.amazon.test.User@512d92b"
    ]
  }
}

I thought support for this was added in 1.9.15? Below is exactly what I tried to store.

    DynamoDBMapper productMapper = new DynamoDBMapper(dynamoDBClient);

    User user1 = new User();
    user1.setUserId(1);
    user1.setFirstName("user1");

    User user2 = new User();
    user2.setUserId(2);
    user2.setFirstName("user2");

    Set<User> users = new HashSet<>();
    users.add(user1);
    users.add(user2);

    Product p = new Product();
    p.setProductName("productA");
    p.setUsers(users);

    productMapper.save(p); 

As we discussed above, we kept the original (admittedly pretty strange) behavior of the mapper by default to avoid breaking backwards-compatibility. To get the new behavior, you need to explicitly configure the mapper to use the new V2 conversion schema.

new DynamoDBMapper(dynamoDBClient, new DynamoDBMapperConfig(ConversionSchemas.V2));

We should get that WARN message updated to mention the V2 schema (and document it better in general).

Thank you, setting the ConversionSchema worked perfectly.

Now, one final strange piece of behavior is that SaveBehavior.APPEND_SET does not work with these sets.

After running the original example and then running the below, I expected to see a single row corresponding to Product A with Users 1, 2, 3, 4 stored in Dynamo.

    User user3 = new User();
    user3.setUserId(3);
    user3.setFirstName("user3");

    User user4 = new User();
    user4.setUserId(4);
    user4.setFirstName("user4");

    Set<User> users = new HashSet<User>();
    users.add(user3);
    users.add(user4);

    Product p = new Product();
    p.setProductName("productA");
    p.setUsers(users);

    productMapper.save(p);

Ah, yeah. Sets of objects are stored as lists in DynamoDB, not sets (DDB only supports scalar sets). We're currently still using the old AttributeUpdates parameter for doing updates, which doesn't support a way to append to a list. You'll need to do a read-update-(conditional)-write to append to these "sets".

We could theoretically support something like APPEND_SET for list-y sets like this if we upgraded to using the newer UpdateExpression parameter (which has a list_append function). It wouldn't technically be set-append though, since the server-side won't enforce uniqueness within a list. If you append an object which - from your comparator's perspective - is already in the set, you'll end up with two copies stored server-side and (as per our discussion above) you'll get a DynamoDBMappingException if you try to read this "set" back out.

Given that caveat, would this still be a worthwhile feature for you?
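
The read-update-(conditional)-write suggested above can be sketched without the SDK. Everything below is a hypothetical in-memory stand-in: InMemoryStore, VersionedItem and appendUnique are invented names, and the version check only imitates a conditional write. With the real mapper, one option would be a version attribute on the item (@DynamoDBVersionAttribute), in which case a stale save fails with ConditionalCheckFailedException and the caller retries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class AppendDemo {
    // Read the item, append locally (skipping values already present),
    // then write only if nobody else changed the item; retry on conflict.
    static void appendUnique(InMemoryStore store, List<String> toAdd) {
        while (true) {
            VersionedItem seen = store.read();
            List<String> merged = new ArrayList<>(seen.users);
            for (String user : toAdd) {
                if (!merged.contains(user)) {
                    merged.add(user);
                }
            }
            if (store.conditionalWrite(seen.version, merged)) {
                return;
            }
        }
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        appendUnique(store, Arrays.asList("user1", "user2"));
        appendUnique(store, Arrays.asList("user3", "user4"));
        System.out.println(store.read().users);
    }
}

// Hypothetical stand-in for one stored item: a list plus a version number.
class VersionedItem {
    final long version;
    final List<String> users;

    VersionedItem(long version, List<String> users) {
        this.version = version;
        this.users = Collections.unmodifiableList(new ArrayList<>(users));
    }
}

// Hypothetical store whose write succeeds only if the caller saw the latest version.
class InMemoryStore {
    private VersionedItem current = new VersionedItem(0, new ArrayList<String>());

    synchronized VersionedItem read() {
        return current;
    }

    synchronized boolean conditionalWrite(long expectedVersion, List<String> newUsers) {
        if (current.version != expectedVersion) {
            return false; // someone else wrote first; caller must re-read
        }
        current = new VersionedItem(expectedVersion + 1, newUsers);
        return true;
    }
}
```

Doing the uniqueness check on the client before writing is what keeps the stored list free of duplicates, since the server side does not enforce uniqueness within a list.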

Even with the caveat, this would be a useful feature. It has me wondering
now why there is no APPEND_LIST in the first place. The type of APPEND_SET
functionality you have described is basically APPEND_LIST, so why not just
support this instead and make it available to both lists and list-y sets?
This would be a great feature.

Sounds reasonable. Supporting APPEND_LIST would take a bit of a rewrite to switch to use UpdateExpressions - there's no way to append to a list using AttributeUpdates, and you can't mix and match the two. Want to go ahead and open a new issue to track that? Or if you're feeling spunky and want to send a pull request to get the ball rolling I wouldn't complain. :)

Thanks, Master!
it works!

@david-at-aws It would be great if the DDB documentation could be updated with this. The page on supported data types for the Java SDK has no information about Map and List support. http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DynamoDBMapper.DataTypes.html

@david-at-aws Thanks for your answer. new DynamoDBMapperConfig(ConversionSchemas.V2) is now deprecated in versions [1.11,2.0). What should I write instead?

@david-at-aws
Hi, I have found the recommended way to do this:
DynamoDBMapperConfig config = DynamoDBMapperConfig.builder().withConversionSchema(ConversionSchemas.V2).build();

Is that right?

I still can't get my set of complex objects, Set<Contact>, to persist properly. Contact is a DynamoDBDocument. I'm using the V2 ConversionSchema as xingtanzjr demonstrated above, but the DynamoDB entry looks like it's persisting each Contact as a String, not a mapped document. If I convert it to a List, the data looks as expected. But as a Set, it looks like the entry below and unmarshalls to a LinkedHashSet, which is not what I want:

         "Contacts": {
                "SS": [
                    "Harry Vendor,[email protected],1,571,5559090", 
                    "Joe Schmo,[email protected],1,443,5553434"
                ]
            }

I can confirm the issue.

Neither

new DynamoDBMapper(dynamoDB, DynamoDBMapperConfig.builder().withConversionSchema(ConversionSchemas.V2).build());

nor

new DynamoDBMapper(dynamoDB, new DynamoDBMapperConfig(ConversionSchemas.V2));

works as expected. DynamoDBMapper still invokes toString() on the objects in the Set, so the attribute is stored as a StringSet.
