Saturday, November 1, 2014

Alfresco: No live node exists - error halts solr indexing





Error: 

Using Alfresco 4.0.d on PostgreSQL with Solr indexing, I see this error in alfresco.log:
2012-02-15 08:11:50,393  ERROR [extensions.webscripts.AbstractRuntime] [http-8443-4] Exception from executeScript - redirecting to status template error: 01156612 Wrapped Exception (with status template): No live node exists: 

   ID:        520521

   Cache row: NodeEntity[ ID=520521, version=16, store=workspace://SpacesStore, uuid=64b01e12-dfac-4b22-96d0-bfa30ba7d34e, typeQNameId=32, localeId=15, aclId=null, deleted=true, transaction=TransactionEntity[ ID=1064115, server=null, changeTxnId=f4359d12-f144-43c7-8c17-18697c4eb864, commitTimeMs=null], auditProps=AuditablePropertiesEntity[ auditCreator=anbj01, auditCreated=2012-02-07T08:05:37.520+01:00, auditModifier=anbj01, auditModified=2012-02-07T09:20:18.000+01:00]]

   DB row:    NodeEntity[ ID=520521, version=16, store=workspace://SpacesStore, uuid=64b01e12-dfac-4b22-96d0-bfa30ba7d34e, typeQNameId=32, localeId=15, aclId=null, deleted=true, transaction=TransactionEntity[ ID=1064115, server=null, changeTxnId=f4359d12-f144-43c7-8c17-18697c4eb864, commitTimeMs=null], auditProps=AuditablePropertiesEntity[ auditCreator=anbj01, auditCreated=2012-02-07T08:05:37.520+01:00, auditModifier=anbj01, auditModified=2012-02-07T09:20:18.000+01:00]]

 org.springframework.extensions.webscripts.WebScriptException: 01156612 Wrapped Exception (with status template): No live node exists: 

   ID:        520521

   Cache row: NodeEntity[ ID=520521, version=16, store=workspace://SpacesStore, uuid=64b01e12-dfac-4b22-96d0-bfa30ba7d34e, typeQNameId=32, localeId=15, aclId=null, deleted=true, transaction=TransactionEntity[ ID=1064115, server=null, changeTxnId=f4359d12-f144-43c7-8c17-18697c4eb864, commitTimeMs=null], auditProps=AuditablePropertiesEntity[ auditCreator=anbj01, auditCreated=2012-02-07T08:05:37.520+01:00, auditModifier=anbj01, auditModified=2012-02-07T09:20:18.000+01:00]]

   DB row:    NodeEntity[ ID=520521, version=16, store=workspace://SpacesStore, uuid=64b01e12-dfac-4b22-96d0-bfa30ba7d34e, typeQNameId=32, localeId=15, aclId=null, deleted=true, transaction=TransactionEntity[ ID=1064115, server=null, changeTxnId=f4359d12-f144-43c7-8c17-18697c4eb864, commitTimeMs=null], auditProps=AuditablePropertiesEntity[ auditCreator=anbj01, auditCreated=2012-02-07T08:05:37.520+01:00, auditModifier=anbj01, auditModified=2012-02-07T09:20:18.000+01:00]]

This error halts Solr indexing; it cannot get past that node. I think Solr should be more fault tolerant, but my primary question here is: how can this be fixed?


Solution 1: This will work for all versions of ICP. If you are using Alfresco 4.2+, Solution 2 is the recommended one.

I tried this on 4.0.e.

It has to be applied directly to the Alfresco database.

This was not a cache issue; I had to remove the "no live" node directly from the database.
Solr is definitely exposing underlying issues in the database. I'm not sure whether it would be better for Solr (like the old Lucene indexing) to move past errors so that indexing can continue, or to halt as it did here. The halt went unnoticed for several days, so the index was very outdated. But if indexing had moved on, I'm not sure the error in alfresco.log would have been spotted and subsequently fixed.
This is how I fixed it:
I ran these select statements to find out where node ID 520521 is referenced:
select * from alf_child_assoc where child_node_id = 520521;

select * from alf_node_assoc where target_node_id = 520521;

select * from alf_node_assoc where source_node_id = 520521;

select * from alf_node_aspects where node_id = 520521;

select * from alf_node_properties where node_id = 520521;

select * from alf_node where id = 520521;
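The lookups above can also be scripted, so that the same checks are easy to repeat for another node ID. This is only a minimal Python sketch: the table and column names are the ones from the select statements, but `count_references` and `demo_connection` are made-up helper names, and the in-memory sqlite database is a toy stand-in for the real PostgreSQL schema. Against PostgreSQL you would pass a connection from your own driver and use its placeholder style (for example `%s` instead of `?`).

```python
import sqlite3

# Table/column pairs taken from the select statements above.
LOOKUPS = [
    ("alf_child_assoc", "child_node_id"),
    ("alf_node_assoc", "target_node_id"),
    ("alf_node_assoc", "source_node_id"),
    ("alf_node_aspects", "node_id"),
    ("alf_node_properties", "node_id"),
    ("alf_node", "id"),
]


def count_references(conn, node_id):
    """Return {(table, column): number of rows that mention node_id}."""
    counts = {}
    for table, column in LOOKUPS:
        # Identifiers come from the fixed list above, never from user input.
        sql = f"select count(*) from {table} where {column} = ?"
        counts[(table, column)] = conn.execute(sql, (node_id,)).fetchone()[0]
    return counts


def demo_connection():
    """Hypothetical toy schema, far simpler than the real Alfresco tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table alf_node (id integer primary key);
        create table alf_child_assoc (child_node_id integer);
        create table alf_node_assoc (source_node_id integer, target_node_id integer);
        create table alf_node_aspects (node_id integer);
        create table alf_node_properties (node_id integer);
        insert into alf_node values (520521), (530544);
        insert into alf_node_assoc values (520521, 530544);
    """)
    return conn


if __name__ == "__main__":
    for (table, column), n in count_references(demo_connection(), 520521).items():
        print(f"{table}.{column}: {n} row(s)")
```

Counting rows per table gives a quick overview; the full rows can still be inspected with the original select statements once you know which tables are involved.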

From there I could see that the node had no properties, was not a child of any node, and didn't have any associations.
It did, however, have a child node (webpreview). To me these look like the remains of a transaction that went badly wrong, so the row in alf_node can (and must) be deleted.
So I deleted the rows:
delete from alf_node_assoc where source_node_id = 520521;

delete from alf_node where id = 520521;

Then, for the now-orphaned webpreview child node:
 update alf_node set NODE_DELETED = true where id = 530544;

After that, indexing kicked in and now works.
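Since these statements change the repository database by hand, it may be safer to run them as a single transaction, so that a failure halfway through leaves nothing half-deleted. Here is a rough Python sketch of that idea, not the way I actually ran it: `remove_dead_node` and `toy_db` are hypothetical names, the sqlite database is a toy stand-in for PostgreSQL, and `1` stands in for PostgreSQL's `true`.

```python
import sqlite3


def remove_dead_node(conn, node_id, orphan_id):
    """Run the two deletes and the update as one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "delete from alf_node_assoc where source_node_id = ?", (node_id,))
        deleted = conn.execute(
            "delete from alf_node where id = ?", (node_id,)).rowcount
        if deleted != 1:
            # Unexpected row count: raising here undoes the assoc delete too.
            raise RuntimeError(f"expected 1 alf_node row, deleted {deleted}")
        # sqlite stores booleans as integers, so 1 plays the role of true.
        conn.execute(
            "update alf_node set node_deleted = 1 where id = ?", (orphan_id,))


def toy_db():
    """Hypothetical miniature schema, only for trying the function out."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table alf_node (id integer primary key, node_deleted integer);
        create table alf_node_assoc (source_node_id integer, target_node_id integer);
        insert into alf_node values (520521, 1), (530544, 0);
        insert into alf_node_assoc values (520521, 530544);
    """)
    return conn


if __name__ == "__main__":
    conn = toy_db()
    remove_dead_node(conn, 520521, 530544)
    print(conn.execute("select id, node_deleted from alf_node").fetchall())
```

The row-count check is a design choice of the sketch: if the delete from alf_node does not remove exactly one row, everything is rolled back rather than leaving the association rows deleted on their own.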

*** If the delete from alf_node fails, you first need to delete the rows that reference the node from the other tables that have a foreign key to the alf_node table.


Solution 2:

This may not work on Alfresco versions lower than 4.2.

Following some steps from the wiki, I ran the Solr FIX action. As I hadn't used any of the Solr URLs before, I first had to set up the certificate in my browser. I copied browser.p12 from /opt/alfresco-4.0.d/tomcat/webapps/alfresco/WEB-INF/classes/keystore/browser.p12 on the server (my directory structure, YMMV) to my desktop, then imported it into Firefox (there are better instructions on the wiki). I then navigated to https://our.alfresco.url:8443/solr/admin/cores?action=FIX in Firefox, accepted the untrusted certificate exception, and waited a few minutes for the page to load. Once it had loaded, it displayed:
<response><lst name="responseHeader"><int name="status">0</int><int name="QTime">48942</int></lst></response>
and the error messages in catalina.out stopped.
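If you call the FIX URL from a script rather than a browser, the response can be checked the same way. A minimal Python sketch, assuming the response has the shape shown above and that a status of 0 in the responseHeader means the request completed without error (the usual Solr convention). `fix_succeeded` is a made-up helper name, and fetching the URL with the client certificate is left out.

```python
import xml.etree.ElementTree as ET


def fix_succeeded(response_xml):
    """Return True if the responseHeader of a Solr XML response has status 0."""
    root = ET.fromstring(response_xml)
    status = root.find("./lst[@name='responseHeader']/int[@name='status']")
    # A missing or empty status element is treated as a failure.
    return status is not None and (status.text or "").strip() == "0"


if __name__ == "__main__":
    sample = (
        '<response><lst name="responseHeader">'
        '<int name="status">0</int><int name="QTime">48942</int>'
        '</lst></response>'
    )
    print(fix_succeeded(sample))
```

A status of 0 only says the request itself went through; whether indexing has really recovered still shows up in catalina.out, as described above.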




















source : https://forums.alfresco.com/forum/developer-discussions/repository-services/no-live-node-exists-error-halts-solr-indexing