I removed a PropertyKey from the schema that was already in use on some nodes. Now I get a NullPointerException whenever I try to scan over all properties of those nodes: elementMap(), valueMap() and drop() all fail. Accessing properties by name, e.g. g.V(1).properties('goodProp', 'badProp'), does not throw, but returns only the good property.

This property was not part of any index (thank goodness). Re-adding the property does not work, and trying to drop() the affected vertices fails.

I suspect this is related to https://github.com/JanusGraph/janusgraph/issues/1812 and have commented on the issue, but I'm asking here as well.

JanusGraph: v1.0.0, Cassandra: v3.11 and v4.1

I'm not looking forward to recreating this data, but part of that cleanup involves dropping the old nodes or writing new data where the old data should be, and I can't do either. I can add a new badProp to the node and read it back with .properties('badProp'), but valueMap() etc. are still broken.

Is there a job that can clean up "zombie properties"? Or, seeing as JanusGraph lets you do something like this, could it at least handle the situation so that recovery is possible without having to rebuild the whole graph?

I've got a reproduction here. The problem only really rears its head once the cache has been cleared. Ignore the fact that the label is 'tweet'; it's not indicative of the real graph, but I needed to call it something. I've also simplified this as much as I can.

// first, set up the schema

mgmt=graph.openManagement()
tweet = mgmt.makeVertexLabel('tweet').make()
t_id = mgmt.makePropertyKey('t_id').dataType(String.class).cardinality(SINGLE).make()
t_msg = mgmt.makePropertyKey('t_msg').dataType(String.class).cardinality(SINGLE).make()
// make sure to link the properties to the node
mgmt.addProperties(tweet, t_id, t_msg)
mgmt.commit()

// add some data
g.addV('tweet').property('t_id', '1').property('t_msg', 'msg 1').iterate()

// now, break the schema
mgmt=graph.openManagement()
mgmt.getPropertyKey('t_id').remove()
mgmt.commit()

// Important: restart the server here and open a new console to clear all caching

// these all fail
g.V().hasLabel('tweet').valueMap()
g.V().hasLabel('tweet').elementMap()
g.V().hasLabel('tweet').drop()

// this works, though only one property (t_msg) is returned
g.V().hasLabel('tweet').properties('t_msg', 't_id')

Here is the stack trace of the failure:

gremlin> g.V(4240).valueMap()
java.lang.NullPointerException
Type ':help' or ':h' for help.
Display stack trace? [yN]y
java.lang.NullPointerException
    at org.janusgraph.graphdb.database.EdgeSerializer.parseRelation(EdgeSerializer.java:137)
    at org.janusgraph.graphdb.database.EdgeSerializer.readRelation(EdgeSerializer.java:83)
    at org.janusgraph.graphdb.transaction.RelationConstructor.readRelation(RelationConstructor.java:70)
    at org.janusgraph.graphdb.transaction.RelationConstructor$1.next(RelationConstructor.java:57)
    at org.janusgraph.graphdb.transaction.RelationConstructor$1.next(RelationConstructor.java:45)
    at org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphPropertyMapStep.addElementPropertiesInternal(JanusGraphPropertyMapStep.java:116)
    at org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphPropertyMapStep.map(JanusGraphPropertyMapStep.java:104)
    at org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphPropertyMapStep.map(JanusGraphPropertyMapStep.java:46)
    at org.apache.tinkerpop.gremlin.process.traversal.step.map.ScalarMapStep.processNextStart(ScalarMapStep.java:40)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:155)
    at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:192)
    at org.apache.tinkerpop.gremlin.server.op.AbstractOpProcessor.handleIterator(AbstractOpProcessor.java:98)
    at org.apache.tinkerpop.gremlin.server.op.AbstractEvalOpProcessor.lambda$evalOpInternal$6(AbstractEvalOpProcessor.java:267)
    at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:283)
    at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
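
Since the failure only surfaces when every property of a vertex is read, one way to gauge how many vertices are affected is to probe them one at a time and record the ones that throw. The following is a minimal sketch, not a fix. It assumes the same `g` as in the reproduction, that hasLabel() and id() still work on the affected vertices (they do in the steps above), and that the failed read can be rolled back safely. The variable name `affected` is just for illustration.

```groovy
// Sketch only: collect the ids of vertices whose full property scan throws.
// Assumes `g` is the traversal source used in the reproduction above.
affected = g.V().hasLabel('tweet').id().toList().findAll { vid ->
    try {
        // valueMap() reads every property, which is what triggers the NPE
        g.V(vid).valueMap().toList()
        false   // readable, so not affected
    } catch (Exception e) {
        // discard whatever state the failed read left in the transaction
        g.tx().rollback()
        true    // the scan threw, so treat this vertex as affected
    }
}
```

This only identifies the broken vertices; it does not repair or remove them.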

And now the attempted recovery:

mgmt=graph.openManagement()
tweet = mgmt.getVertexLabel('tweet')
t_msg = mgmt.getPropertyKey('t_msg')
// recreate the property
t_id = mgmt.makePropertyKey('t_id').dataType(String.class).cardinality(SINGLE).make()
mgmt.addProperties(tweet, t_id, t_msg)
mgmt.commit()

// do a restart if you like

// succeeds
g.V().hasLabel('tweet').property('t_id', '1').iterate()
g.tx().commit()

// shows both properties
g.V().hasLabel('tweet').properties('t_msg', 't_id')

// all still fail with NPE
g.V().hasLabel('tweet').valueMap()
g.V().hasLabel('tweet').elementMap()
g.V().hasLabel('tweet').drop()

// I can drop the new property, but then dropping the node fails
g.V().hasLabel('tweet').properties('t_id').drop() // ok
g.V().hasLabel('tweet').valueMap() // NPE again

So by removing a schema property that is in use on nodes, you can pretty much brick your graph, with no way to add the property back or to remove the bad nodes. Is there any way to recover here? I can, through a convoluted process, rebuild individual nodes and sort out the edges etc., but I really do need to be able to clear the old ones at least. Surely there has to be a way out?
