Encountering a Problem with CodeQL-ruby Query during the Execution Phase of the epsilonStar Function #15199

Open
Tracked by #15201
spingARbor opened this issue Dec 23, 2023 · 4 comments
Labels
question Further information is requested

Comments

spingARbor commented Dec 23, 2023

Dear Sir/Madam,

I'm a novice CodeQL user looking to use the CodeQL Ruby libraries to help me conduct a GitLab code audit. However, while using CodeQL (codeql-cli-v2.15.4) to query flows from RemoteFlowSource, I've encountered a problem where the query appears to be stuck in the execution phase of the epsilonStar function (I've waited for 12 hours with no visible progress).

I noticed that the epsilonStar function was introduced in June of this year. In an attempt to address the problem, I switched to version 2.13.3, which doesn't include this function. Interestingly, using the same query in this version yielded smooth and successful results.

Given my recent introduction to CodeQL, my understanding of the epsilonStar function's functionality is limited. As a result, I'm unsure if this issue is a result of my query approach or if there might be a certain flaw in the current functionality.

I have attached the query code I used and a screenshot of the runtime situation for your reference. I would greatly appreciate any guidance or assistance you could provide.

Thank you once again for your support.

Best regards.

/**
 * @name Find all Ruby RemoteFlowSources in a project
 * @description This query finds all sensitivemethod definitions in a Ruby project.
 * @id rb/examples/mytaint1
 */

import codeql.ruby.AST
import codeql.ruby.DataFlow
import codeql.ruby.dataflow.RemoteFlowSources

class PathtravalConfig extends DataFlow::Configuration {
  PathtravalConfig() { this = "PathtravalConfig" }

  override predicate isSource(DataFlow::Node source) {
    source instanceof RemoteFlowSource
  }

  // get sinks
  override predicate isSink(DataFlow::Node sink) {
    exists(Method method |
      sink.asParameter() = method.getAParameter()
    )
  }
}

from DataFlow::PathNode source, DataFlow::PathNode sink, PathtravalConfig conf
where conf.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "Potential sensitive operations involving $@.", source.getNode(),
  "this specific variable"

@spingARbor
Author

issue

Member

mbg commented Dec 27, 2023

Hi @spingARbor 👋

Thanks for asking this question!

I suspect the most likely explanation is that the query you have written is simply very expensive to evaluate. You are essentially trying to find all data flow paths from any RemoteFlowSource to any parameter of any method in the codebase. On any non-trivial codebase, you can easily run into performance problems with that. Even if the performance were fine, I would not expect the results of this query to be particularly useful.

It's probably worth thinking more about what you are actually interested in and writing more specific sources or sinks for that, which also reduces the number of results your query produces. Let me know if you need any help with that!
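
As a rough illustration of what a narrower sink could look like, the sketch below keeps the same source but restricts sinks to path arguments of file system accesses. It assumes that the `FileSystemAccess` concept and its `getAPathArgument()` predicate from `codeql.ruby.Concepts` are available in the installed Ruby library pack, and it has not been checked against codeql-cli-v2.15.4. The query id and the class name `NarrowPathConfig` are made up for the example.

```ql
/**
 * @name Remote input reaching a file system access (sketch)
 * @kind path-problem
 * @id rb/examples/remote-to-file-access
 */

import codeql.ruby.AST
import codeql.ruby.DataFlow
import codeql.ruby.Concepts
import codeql.ruby.dataflow.RemoteFlowSources
import DataFlow::PathGraph

// Hypothetical configuration name; same class-based style as the original query.
class NarrowPathConfig extends DataFlow::Configuration {
  NarrowPathConfig() { this = "NarrowPathConfig" }

  override predicate isSource(DataFlow::Node source) {
    source instanceof RemoteFlowSource
  }

  // Assumption: FileSystemAccess models calls such as File.open, and
  // getAPathArgument() returns the node used as the file path.
  override predicate isSink(DataFlow::Node sink) {
    sink = any(FileSystemAccess access).getAPathArgument()
  }
}

from DataFlow::PathNode source, DataFlow::PathNode sink, NarrowPathConfig conf
where conf.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "File path depends on $@.", source.getNode(),
  "a remote flow source"
```

With sinks limited to a specific kind of operation, the flow analysis has far fewer targets to reach than "every parameter of every method", which is the main reason such a query would be expected to finish sooner.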

@spingARbor
Author

Happy New Year, @mbg!
Thank you for your response.
While constructing the entire query, I also attempted to use 'Quick Evaluation: isSource' to query only the results for RemoteFlowSource, but I still encountered the same issue.

Member

mbg commented Jan 3, 2024

Hi @spingARbor,

Even though you are intending to just evaluate isSource, CodeQL likely still evaluates other predicates in the same class/etc. as well. To verify this, I would suggest that you temporarily comment out everything but your isSource predicate so that you have just:

import codeql.ruby.AST
import codeql.ruby.DataFlow
import codeql.ruby.dataflow.RemoteFlowSources
   
predicate isSource(DataFlow::Node source) {
  source instanceof RemoteFlowSource
}

You can then evaluate just this. I would expect this to yield results, even with a large database. If this still doesn't work, then there might be something else going on.


2 participants