Skip to content

Add MaDs for Apache Avro#21754

Draft
jacknojo wants to merge 2 commits into
mainfrom
jacknojo/add_llm_generated_mads_for_avro
Draft

Add MaDs for Apache Avro#21754
jacknojo wants to merge 2 commits into
mainfrom
jacknojo/add_llm_generated_mads_for_avro

Conversation

@jacknojo
Copy link
Copy Markdown
Contributor

@jacknojo jacknojo commented Apr 24, 2026

Add Models as Data for the Java version of Apache Avro. This is based on this subfolder/commit.

This is entirely LLM-generated and the output has undergone a voting procedure. It is not meant to fully cover the library. I am curious for any feedback on this. We also need to decide if the provenance is OK or if there is a better name.

Copilot AI review requested due to automatic review settings April 24, 2026 11:37
@jacknojo jacknojo requested a review from a team as a code owner April 24, 2026 11:37
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review any files in this pull request.

@github-actions github-actions Bot added the Java label Apr 24, 2026
@jacknojo jacknojo requested a review from michaelnebel April 24, 2026 11:38
@jacknojo
Copy link
Copy Markdown
Contributor Author

I am not asking for a full review necessarily, this is machine-generated and I have looked at the rows myself individually. Obviously my CodeQL knowledge is still limited so I may very well have made mistakes.

I am open for discussing how to approach getting this reviewed/merged.

@jacknojo jacknojo removed the request for review from michaelnebel April 27, 2026 13:24
@jacknojo jacknojo marked this pull request as draft April 27, 2026 13:24
@jacknojo jacknojo changed the base branch from main to jacknojo/move_java_generated_mads April 27, 2026 13:42
Copy link
Copy Markdown
Contributor

@owen-mc owen-mc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the output format needs to be updated to correctly deal with nested classes.

Comment thread java/ql/lib/ext/generated/llmgenerator/org.apache.avro.data.model.yml Outdated
Comment thread java/ql/lib/ext/generated/llmgenerator/org.apache.avro.data.model.yml Outdated
@jacknojo jacknojo force-pushed the jacknojo/add_llm_generated_mads_for_avro branch from 5d78705 to ec10873 Compare April 28, 2026 11:24
@jacknojo jacknojo changed the base branch from jacknojo/move_java_generated_mads to main April 28, 2026 11:24
@owen-mc
Copy link
Copy Markdown
Contributor

owen-mc commented Apr 28, 2026

In the past, when we have generated too many models to manually check, we have validated them by running QA and checking the alert changes are reasonable (not too many, not too many FPs). It turns out that if you have 100 incorrect models, most of them won't cause any problems, because they just don't lead to any extra data flow paths, but there are some which do cause a lot of extra data flow paths, which will cause lots of FPs, which are pretty obvious when you look at the QA results.

This PR doesn't introduce that many models, so it might be worth reviewing them manually. Also, the above procedure won't work for sources that belong to non-default threat models. It might well be possible to run QA with a particular threat model. I haven't done this before so I've asked around. If so we should definitely make sure to add file to the default remote threat model, so that those models are exercised.

@jacknojo jacknojo force-pushed the jacknojo/add_llm_generated_mads_for_avro branch from ec10873 to bb270db Compare May 5, 2026 08:41
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 5, 2026

⚠️ The head of this PR and the base branch were compared for differences in the framework coverage reports. The generated reports are available in the artifacts of this workflow run. The differences will be picked up by the nightly job after the PR gets merged.

Click to show differences in coverage

java

Generated file changes for java

  • Changes to framework-coverage-java.rst:
-    Others,"``actions.osgi``, ``antlr``, ``ch.ethz.ssh2``, ``cn.hutool.core.codec``, ``com.alibaba.com.caucho.hessian.io``, ``com.alibaba.druid.sql``, ``com.alibaba.fastjson2``, ``com.amazonaws.auth``, ``com.auth0.jwt.algorithms``, ``com.azure.identity``, ``com.caucho.burlap.io``, ``com.caucho.hessian.io``, ``com.cedarsoftware.util.io``, ``com.esotericsoftware.kryo.io``, ``com.esotericsoftware.kryo5.io``, ``com.esotericsoftware.yamlbeans``, ``com.hubspot.jinjava``, ``com.jcraft.jsch``, ``com.microsoft.sqlserver.jdbc``, ``com.mitchellbosecke.pebble``, ``com.opensymphony.xwork2``, ``com.sshtools.j2ssh.authentication``, ``com.sun.crypto.provider``, ``com.sun.jndi.ldap``, ``com.sun.net.httpserver``, ``com.sun.net.ssl``, ``com.sun.rowset``, ``com.sun.security.auth.module``, ``com.sun.security.ntlm``, ``com.sun.security.sasl.digest``, ``com.thoughtworks.xstream``, ``com.trilead.ssh2``, ``com.unboundid.ldap.sdk``, ``com.zaxxer.hikari``, ``flexjson``, ``hudson``, ``io.jsonwebtoken``, ``io.undertow.server.handlers.resource``, ``javafx.scene.web``, ``jenkins``, ``jodd.json``, ``liquibase.database.jvm``, ``liquibase.statement.core``, ``net.lingala.zip4j``, ``net.schmizz.sshj``, ``net.sf.json``, ``net.sf.saxon.s9api``, ``ognl``, ``org.acegisecurity``, ``org.antlr.runtime``, ``org.apache.commons.codec``, ``org.apache.commons.compress.archivers.tar``, ``org.apache.commons.exec``, ``org.apache.commons.fileupload``, ``org.apache.commons.httpclient.util``, ``org.apache.commons.jelly``, ``org.apache.commons.jexl2``, ``org.apache.commons.jexl3``, ``org.apache.commons.lang``, ``org.apache.commons.logging``, ``org.apache.commons.net``, ``org.apache.commons.ognl``, ``org.apache.cxf.catalog``, ``org.apache.cxf.common.classloader``, ``org.apache.cxf.common.jaxb``, ``org.apache.cxf.common.logging``, ``org.apache.cxf.configuration.jsse``, ``org.apache.cxf.helpers``, ``org.apache.cxf.resource``, ``org.apache.cxf.staxutils``, ``org.apache.cxf.tools.corba.utils``, ``org.apache.cxf.tools.util``, ``org.apache.cxf.transform``, ``org.apache.directory.ldap.client.api``, ``org.apache.hadoop.fs``, ``org.apache.hadoop.hive.metastore``, ``org.apache.hadoop.hive.ql.exec``, ``org.apache.hadoop.hive.ql.metadata``, ``org.apache.hc.client5.http.async.methods``, ``org.apache.hc.client5.http.classic.methods``, ``org.apache.hc.client5.http.fluent``, ``org.apache.hive.hcatalog.templeton``, ``org.apache.ibatis.jdbc``, ``org.apache.ibatis.mapping``, ``org.apache.log4j``, ``org.apache.shiro.authc``, ``org.apache.shiro.codec``, ``org.apache.shiro.jndi``, ``org.apache.shiro.mgt``, ``org.apache.sshd.client.session``, ``org.apache.tools.ant``, ``org.apache.tools.zip``, ``org.codehaus.cargo.container.installer``, ``org.dom4j``, ``org.exolab.castor.xml``, ``org.fusesource.leveldbjni``, ``org.geogebra.web.full.main``, ``org.gradle.api.file``, ``org.ho.yaml``, ``org.influxdb``, ``org.jabsorb``, ``org.jboss.vfs``, ``org.jdbi.v3.core``, ``org.jenkins.ui.icon``, ``org.jenkins.ui.symbol``, ``org.keycloak.models.map.storage``, ``org.kohsuke.stapler``, ``org.lastaflute.web``, ``org.mvel2``, ``org.openjdk.jmh.runner.options``, ``org.owasp.esapi``, ``org.pac4j.jwt.config.encryption``, ``org.pac4j.jwt.config.signature``, ``org.scijava.log``, ``org.xml.sax``, ``org.xmlpull.v1``, ``play.libs.ws``, ``play.mvc``, ``ratpack.core.form``, ``ratpack.core.handling``, ``ratpack.core.http``, ``ratpack.exec``, ``ratpack.form``, ``ratpack.func``, ``ratpack.handling``, ``ratpack.http``, ``ratpack.util``, ``software.amazon.awssdk.transfer.s3.model``, ``sun.jvmstat.perfdata.monitor.protocol.local``, ``sun.jvmstat.perfdata.monitor.protocol.rmi``, ``sun.misc``, ``sun.net.ftp``, ``sun.net.www.protocol.http``, ``sun.security.acl``, ``sun.security.jgss.krb5``, ``sun.security.krb5``, ``sun.security.pkcs``, ``sun.security.pkcs11``, ``sun.security.provider``, ``sun.security.ssl``, ``sun.security.x509``, ``sun.tools.jconsole``",108,6034,757,131,6,14,18,,185
+    Others,"``actions.osgi``, ``antlr``, ``ch.ethz.ssh2``, ``cn.hutool.core.codec``, ``com.alibaba.com.caucho.hessian.io``, ``com.alibaba.druid.sql``, ``com.alibaba.fastjson2``, ``com.amazonaws.auth``, ``com.auth0.jwt.algorithms``, ``com.azure.identity``, ``com.caucho.burlap.io``, ``com.caucho.hessian.io``, ``com.cedarsoftware.util.io``, ``com.esotericsoftware.kryo.io``, ``com.esotericsoftware.kryo5.io``, ``com.esotericsoftware.yamlbeans``, ``com.hubspot.jinjava``, ``com.jcraft.jsch``, ``com.microsoft.sqlserver.jdbc``, ``com.mitchellbosecke.pebble``, ``com.opensymphony.xwork2``, ``com.sshtools.j2ssh.authentication``, ``com.sun.crypto.provider``, ``com.sun.jndi.ldap``, ``com.sun.net.httpserver``, ``com.sun.net.ssl``, ``com.sun.rowset``, ``com.sun.security.auth.module``, ``com.sun.security.ntlm``, ``com.sun.security.sasl.digest``, ``com.thoughtworks.xstream``, ``com.trilead.ssh2``, ``com.unboundid.ldap.sdk``, ``com.zaxxer.hikari``, ``flexjson``, ``hudson``, ``io.jsonwebtoken``, ``io.undertow.server.handlers.resource``, ``javafx.scene.web``, ``jenkins``, ``jodd.json``, ``liquibase.database.jvm``, ``liquibase.statement.core``, ``net.lingala.zip4j``, ``net.schmizz.sshj``, ``net.sf.json``, ``net.sf.saxon.s9api``, ``ognl``, ``org.acegisecurity``, ``org.antlr.runtime``, ``org.apache.avro``, ``org.apache.commons.codec``, ``org.apache.commons.compress.archivers.tar``, ``org.apache.commons.exec``, ``org.apache.commons.fileupload``, ``org.apache.commons.httpclient.util``, ``org.apache.commons.jelly``, ``org.apache.commons.jexl2``, ``org.apache.commons.jexl3``, ``org.apache.commons.lang``, ``org.apache.commons.logging``, ``org.apache.commons.net``, ``org.apache.commons.ognl``, ``org.apache.cxf.catalog``, ``org.apache.cxf.common.classloader``, ``org.apache.cxf.common.jaxb``, ``org.apache.cxf.common.logging``, ``org.apache.cxf.configuration.jsse``, ``org.apache.cxf.helpers``, ``org.apache.cxf.resource``, ``org.apache.cxf.staxutils``, ``org.apache.cxf.tools.corba.utils``, ``org.apache.cxf.tools.util``, ``org.apache.cxf.transform``, ``org.apache.directory.ldap.client.api``, ``org.apache.hadoop.fs``, ``org.apache.hadoop.hive.metastore``, ``org.apache.hadoop.hive.ql.exec``, ``org.apache.hadoop.hive.ql.metadata``, ``org.apache.hc.client5.http.async.methods``, ``org.apache.hc.client5.http.classic.methods``, ``org.apache.hc.client5.http.fluent``, ``org.apache.hive.hcatalog.templeton``, ``org.apache.ibatis.jdbc``, ``org.apache.ibatis.mapping``, ``org.apache.log4j``, ``org.apache.shiro.authc``, ``org.apache.shiro.codec``, ``org.apache.shiro.jndi``, ``org.apache.shiro.mgt``, ``org.apache.sshd.client.session``, ``org.apache.tools.ant``, ``org.apache.tools.zip``, ``org.codehaus.cargo.container.installer``, ``org.dom4j``, ``org.exolab.castor.xml``, ``org.fusesource.leveldbjni``, ``org.geogebra.web.full.main``, ``org.gradle.api.file``, ``org.ho.yaml``, ``org.influxdb``, ``org.jabsorb``, ``org.jboss.vfs``, ``org.jdbi.v3.core``, ``org.jenkins.ui.icon``, ``org.jenkins.ui.symbol``, ``org.keycloak.models.map.storage``, ``org.kohsuke.stapler``, ``org.lastaflute.web``, ``org.mvel2``, ``org.openjdk.jmh.runner.options``, ``org.owasp.esapi``, ``org.pac4j.jwt.config.encryption``, ``org.pac4j.jwt.config.signature``, ``org.scijava.log``, ``org.xml.sax``, ``org.xmlpull.v1``, ``play.libs.ws``, ``play.mvc``, ``ratpack.core.form``, ``ratpack.core.handling``, ``ratpack.core.http``, ``ratpack.exec``, ``ratpack.form``, ``ratpack.func``, ``ratpack.handling``, ``ratpack.http``, ``ratpack.util``, ``software.amazon.awssdk.transfer.s3.model``, ``sun.jvmstat.perfdata.monitor.protocol.local``, ``sun.jvmstat.perfdata.monitor.protocol.rmi``, ``sun.misc``, ``sun.net.ftp``, ``sun.net.www.protocol.http``, ``sun.security.acl``, ``sun.security.jgss.krb5``, ``sun.security.krb5``, ``sun.security.pkcs``, ``sun.security.pkcs11``, ``sun.security.provider``, ``sun.security.ssl``, ``sun.security.x509``, ``sun.tools.jconsole``",125,6034,795,147,6,14,18,,186
-    Totals,,363,26381,2684,404,16,137,33,1,409
+    Totals,,380,26381,2722,420,16,137,33,1,410
  • Changes to framework-coverage-java.csv:
+ org.apache.avro,38,17,,,,,,,,,,,,,,,,,,,,,,,,,16,,,,,,,,,1,,,,,21,,,,,,,,,17,,,

@jacknojo jacknojo force-pushed the jacknojo/add_llm_generated_mads_for_avro branch from bb270db to ae7f72d Compare May 5, 2026 12:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants