Taxonomic applicability discrepancy between XML and UI
#1
I am working with your latest released dump (https://aopwiki.org/downloads/aop-wiki-xml-2022-01-01.gz), and I've noticed that some information in it is missing or differs from what your website shows.

As an example, AOP nº 69 (https://aopwiki.org/aops/69#taxonomic_applicability) is listed on the website as applicable to rats and humans.
In the XML, this AOP ID has only one taxonomy reference:

"<taxonomy-reference id="3cf5bab0-bfa0-48db-9347-ebd8fdbaea9e" aop-wiki-id="69"/>"

pointing to rats:
```
<taxonomy id="3cf5bab0-bfa0-48db-9347-ebd8fdbaea9e">
    <source-id>10116</source-id>
    <source>NCBI</source>
    <name>rats</name>
```

Another example is AOP nº 202 (https://aopwiki.org/aops/202#taxonomic_applicability), for which there is no taxonomy reference at all in the XML, whereas the webpage lists it as applicable to humans.

Could you please tell me if I am processing the data in an incorrect way or what is the source of this difference?


Many thanks for your help.
Best,
Irene López
#2
Hello Irene - I don't believe the problem is on your end. I think that is an issue with the current XML. I will forward your message to a couple of folks on the technical team that may have a better explanation for you.
#3
Hi Irene,

Sorry for the delayed reply. It looks like we have two issues here. First, we have an error with our XML, and I will post here again as soon as we get the fix pushed out to our production server. The second issue is that you are misinterpreting the identifiers at the bottom of the XML page.

This line
"<taxonomy-reference id="3cf5bab0-bfa0-48db-9347-ebd8fdbaea9e" aop-wiki-id="69"/>"
indicates that the identifier "3cf5bab0-bfa0-48db-9347-ebd8fdbaea9e" in the XML file corresponds to an internal identifier of "69" in the wiki itself, not that it is assigned to AOP #69. Since we don't expose the identifiers for the taxonomy terms, this is difficult to illustrate, but I can point you to the corresponding row for AOP-69:
<aop-reference id="321e260a-892a-4c80-8e5a-a2fb31bba3c1" aop-wiki-id="69"/>.

The second line above means that the identifier "321e260a-892a-4c80-8e5a-a2fb31bba3c1" in the XML corresponds to AOP-69 (https://aopwiki.org/aops/69). On line 186,041 of the January XML file, you will find this line:
<aop id="321e260a-892a-4c80-8e5a-a2fb31bba3c1">. That marks the beginning of the XML record for AOP-69. The taxonomy IDs should be included within that XML block; their absence is an error on our side. As soon as that error is fixed, I'll provide more details here.
#4
The fix for this issue is scheduled to be deployed next weekend (May 7, 2022). If you have any questions after the update, please post them here.
#5
Hi Irene, the issue with the missing taxonomy terms has now been fixed. I've provided a crosswalk below to further clarify my earlier comments about how to map the entities from the XML to the entries in the wiki.

AOP 69 now has the following section summarizing the applicability information:
<applicability>
<sex>
<evidence>High</evidence>
<sex>Male</sex>
</sex>
<life-stage>
<evidence>High</evidence>
<life-stage>Adult, reproductively mature</life-stage>
</life-stage>
<taxonomy taxonomy-id="03a2d35e-5ce7-4275-aabf-65bc0c81f54a">
<evidence>High</evidence>
</taxonomy>
<taxonomy taxonomy-id="9f3880fe-e513-45a8-8050-fbd010dcbab9">
<evidence>High</evidence>
</taxonomy>
</applicability>

You will see that it now includes two <taxonomy> entries that weren't there before. The identifiers for these entries correspond to the following records that are defined earlier in the XML data structure:
<taxonomy id="03a2d35e-5ce7-4275-aabf-65bc0c81f54a">
<source-id>10116</source-id>
<source>NCBI</source>
<name>Rattus norvegicus</name>
</taxonomy>
<taxonomy id="9f3880fe-e513-45a8-8050-fbd010dcbab9">
<source-id>9606</source-id>
<source>NCBI</source>
<name>Homo sapiens</name>
</taxonomy>

There are also two references at the end of the XML:
<taxonomy-reference id="03a2d35e-5ce7-4275-aabf-65bc0c81f54a" aop-wiki-id="66"/>
<taxonomy-reference id="9f3880fe-e513-45a8-8050-fbd010dcbab9" aop-wiki-id="1"/>
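
For anyone scripting this crosswalk, here is a minimal Python sketch of the lookups described above. It is only an illustration: the sample document is stitched together from the fragments quoted in this post, the `<data>` wrapper element and the function name `taxa_for_aop` are assumptions, and if the real file declares a default XML namespace the tag names would need to be qualified accordingly.

```python
import xml.etree.ElementTree as ET

# Sample assembled from the fragments quoted in this thread.
# The <data> wrapper is an assumption, not the real root element.
SAMPLE = """<data>
  <taxonomy id="03a2d35e-5ce7-4275-aabf-65bc0c81f54a">
    <source-id>10116</source-id>
    <source>NCBI</source>
    <name>Rattus norvegicus</name>
  </taxonomy>
  <taxonomy id="9f3880fe-e513-45a8-8050-fbd010dcbab9">
    <source-id>9606</source-id>
    <source>NCBI</source>
    <name>Homo sapiens</name>
  </taxonomy>
  <aop id="f69369fd-e613-4f53-a019-06eb7df29efb">
    <applicability>
      <taxonomy taxonomy-id="03a2d35e-5ce7-4275-aabf-65bc0c81f54a">
        <evidence>High</evidence>
      </taxonomy>
      <taxonomy taxonomy-id="9f3880fe-e513-45a8-8050-fbd010dcbab9">
        <evidence>High</evidence>
      </taxonomy>
    </applicability>
  </aop>
  <aop-reference id="f69369fd-e613-4f53-a019-06eb7df29efb" aop-wiki-id="69"/>
  <taxonomy-reference id="03a2d35e-5ce7-4275-aabf-65bc0c81f54a" aop-wiki-id="66"/>
  <taxonomy-reference id="9f3880fe-e513-45a8-8050-fbd010dcbab9" aop-wiki-id="1"/>
</data>"""


def taxa_for_aop(root, wiki_id):
    """Return (taxon name, evidence) pairs for the AOP with this wiki page number."""
    # Step 1: wiki page number -> per-file XML id, via <aop-reference>.
    xml_id = None
    for ref in root.iter("aop-reference"):
        if ref.get("aop-wiki-id") == str(wiki_id):
            xml_id = ref.get("id")
            break
    if xml_id is None:
        return []

    # Step 2: find the <aop> record carrying that XML id.
    aop = next((a for a in root.iter("aop") if a.get("id") == xml_id), None)
    if aop is None:
        return []

    # Step 3: taxonomy definitions carry "id"; applicability entries
    # carry "taxonomy-id" and point back at those definitions.
    names = {t.get("id"): t.findtext("name")
             for t in root.iter("taxonomy") if t.get("id")}
    return [(names.get(t.get("taxonomy-id")), t.findtext("evidence"))
            for t in aop.iter("taxonomy") if t.get("taxonomy-id")]


print(taxa_for_aop(ET.fromstring(SAMPLE), 69))
```

Note that the `<taxonomy-reference>` rows are not used at all here: the link from an AOP to its taxa goes through `taxonomy-id` inside the `<applicability>` block, not through `aop-wiki-id`.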

So as you can see, after our most recent update, the taxonomy domains are consistent between the web interface and the XML. This change is immediate when accessing the dynamic XML for individual AOPs. The examples above are derived from here: https://aopwiki.org/aops/69.xml. The nightly download file (https://aopwiki.org/downloads/aop-wiki-xml.gz) will be correct as of tomorrow (May 8, 2022). It will also be corrected in our next quarterly release on July 1, 2022. We do not correct errors in previous permanent downloads.

As noted before, the aop-wiki-id for taxonomy terms is not that useful because we don't expose those identifiers. For AOPs, key events, relationships, and stressors, however, these IDs will reference the specific pages in the wiki. For example, AOP 69 has the following identifier in this version of the XML:
<aop id="f69369fd-e613-4f53-a019-06eb7df29efb">
<aop-reference id="f69369fd-e613-4f53-a019-06eb7df29efb" aop-wiki-id="69"/>

Note that the identifiers used within the XML are regenerated each time the XML is created, so those identifiers cannot be used to compare like entities between one XML file and the next. For that, you will need to use the vendor IDs at the bottom of the XML file. These will remain constant from one download file to the next since they correspond to our internal identifiers in the wiki itself.
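
As a rough illustration of that point, the Python sketch below keys records by `aop-wiki-id` instead of by the regenerated XML id. The two tiny documents reuse the two AOP-69 identifiers quoted in this thread; the `<data>` wrapper, the variable names, and the helper `by_wiki_id` are assumptions made for the example, and a real file that declares an XML namespace would need qualified tag names.

```python
import xml.etree.ElementTree as ET

# Two minimal stand-ins for successive downloads. The <data> wrapper is
# an assumption; the ids are the two AOP-69 ids quoted in this thread.
JANUARY = """<data>
  <aop id="321e260a-892a-4c80-8e5a-a2fb31bba3c1"/>
  <aop-reference id="321e260a-892a-4c80-8e5a-a2fb31bba3c1" aop-wiki-id="69"/>
</data>"""

MAY = """<data>
  <aop id="f69369fd-e613-4f53-a019-06eb7df29efb"/>
  <aop-reference id="f69369fd-e613-4f53-a019-06eb7df29efb" aop-wiki-id="69"/>
</data>"""


def by_wiki_id(root, entity="aop"):
    """Map aop-wiki-id -> element, using the <entity>-reference rows."""
    # Index the records by their per-file XML id first.
    elements = {e.get("id"): e for e in root.iter(entity) if e.get("id")}
    # Then re-key them by the wiki id, which stays the same across files.
    return {ref.get("aop-wiki-id"): elements[ref.get("id")]
            for ref in root.iter(entity + "-reference")
            if ref.get("id") in elements}


old = by_wiki_id(ET.fromstring(JANUARY))
new = by_wiki_id(ET.fromstring(MAY))

# Same wiki entity in both files, even though the XML ids differ.
for wiki_id in sorted(old.keys() & new.keys()):
    print(wiki_id, old[wiki_id].get("id"), "->", new[wiki_id].get("id"))
```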

Let me know if you have any further questions or notice any additional discrepancies.