Commit 20dea23

move module-info.java out of the java directory to avoid problems in eclipse
1 parent 25e48a7 commit 20dea23

5 files changed

Lines changed: 311 additions & 1 deletion

AGENTS.md

Lines changed: 259 additions & 0 deletions
@@ -0,0 +1,259 @@
1+
# AGENTS.md — HtmlUnit-CSSParser
2+
3+
## Project Overview
4+
5+
HtmlUnit-CSSParser is a **CSS parser for Java** that reads CSS source text and produces a DOM-style object tree. It has been the CSS parser powering [HtmlUnit](https://www.htmlunit.org/) since version 1.30. The project originated as a fork of [CSSParser 0.9.25](http://cssparser.sourceforge.net/), with the SAC (`org.w3c.css.sac`) dependency removed and a more flexible object model introduced.
6+
7+
- **Group/Artifact:** `org.htmlunit:htmlunit-cssparser`
8+
- **License:** Apache License 2.0
9+
- **Default branch:** `master`
10+
- **Java version:** JDK 17+ (version 5.x, current development); JDK 8+ for 4.x releases
11+
- **Build system:** Maven
12+
13+
## Repository Structure
14+
15+
```
htmlunit-cssparser/
├── pom.xml                          # Maven build configuration
├── checkstyle.xml                   # Checkstyle rules (enforced on build)
├── checkstyle_suppressions.xml      # Checkstyle suppression rules
├── README.md
├── LICENSE                          # Apache 2.0
├── .github/
│   ├── workflows/
│   │   └── codeql.yml               # CodeQL security scanning (Java)
│   ├── dependabot.yml               # Dependabot dependency updates
│   └── FUNDING.yml                  # Sponsorship info
├── src/
│   ├── main/
│   │   ├── java/org/htmlunit/cssparser/
│   │   │   ├── dom/                 # CSS DOM implementation classes
│   │   │   ├── parser/              # Core parser classes
│   │   │   │   ├── condition/       # CSS selector conditions
│   │   │   │   ├── selector/        # CSS selector model
│   │   │   │   └── media/           # Media query support
│   │   │   └── util/                # Utility classes
│   │   └── javacc/
│   │       └── CSS3Parser.jj        # JavaCC grammar file (generates the parser)
│   └── test/
│       ├── java/                    # JUnit 5 test classes
│       └── resources/               # CSS test fixture files
└── target/                          # Build output (not committed)
```
43+
44+
## Build and Test
45+
46+
### Prerequisites
47+
48+
- **Maven 3.6.3+**
49+
- **JDK 17+** (for current master / version 5.x)
50+
51+
### Commands
52+
53+
```bash
# Compile (this also runs JavaCC to generate the parser from CSS3Parser.jj)
mvn compile

# Run all tests
mvn test

# Full build with checkstyle verification
mvn -U clean test

# Check for dependency/plugin updates
mvn versions:display-plugin-updates
mvn versions:display-dependency-updates
```
67+
68+
### Generated Code
69+
70+
The CSS parser is generated from a **JavaCC grammar file** at `src/main/javacc/CSS3Parser.jj`. During the `generate-sources` phase, the `ph-javacc-maven-plugin` generates Java source files into `target/generated-sources/javacc/org/htmlunit/cssparser/parser/javacc/`. A post-processing step using the `maven-replacer-plugin` cleans up the generated code (removes dead code patterns produced by JavaCC).
71+
72+
**Do not manually edit files in `target/generated-sources/`** — they are regenerated on every build. If parser behavior needs to change, edit `src/main/javacc/CSS3Parser.jj`.
73+
74+
## Architecture and Key Packages
75+
76+
### `org.htmlunit.cssparser.parser` — Core Parser
77+
78+
The main entry point for users. Key classes:
79+
80+
| Class | Purpose |
|---|---|
| `CSSOMParser` | High-level parser that produces a DOM-style tree from CSS input. Main public API. |
| `AbstractCSSParser` | Base class with shared parsing logic; `CSS3Parser` (generated) extends this. |
| `InputSource` | Wraps a `Reader` to feed CSS text to the parser. Replaces the old SAC `InputSource`. |
| `LexicalUnit` / `LexicalUnitImpl` | Represents CSS values (lengths, colors, functions, etc.) as a linked list of lexical tokens. |
| `CSSErrorHandler` | Interface for custom error handling during parsing. Replaces the old SAC `ErrorHandler`. |
| `CSSException` / `CSSParseException` | Exception types for parse errors. |
| `DocumentHandler` / `HandlerBase` | Event-based (SAX-like) callback interface for streaming CSS parsing. |
| `Locator` / `Locatable` | Source location tracking (line/column numbers). |
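
The "linked list of lexical tokens" idea from the table can be pictured with a small, self-contained sketch. The `ValueUnit` class below is hypothetical and is not the library's `LexicalUnit` API; it only shows how a value such as `1px solid red` might be held as a chain of units, each pointing to the next.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a lexical unit; NOT the library's LexicalUnit API. */
final class ValueUnit {
    private final String text_;
    private ValueUnit next_;

    ValueUnit(final String text) {
        text_ = text;
    }

    /** Links the given unit after this one and returns the linked unit. */
    ValueUnit append(final ValueUnit next) {
        next_ = next;
        return next;
    }

    /** Walks the chain starting at this unit and collects the text of each unit. */
    List<String> toList() {
        final List<String> result = new ArrayList<>();
        for (ValueUnit unit = this; unit != null; unit = unit.next_) {
            result.add(unit.text_);
        }
        return result;
    }

    /** Splits a whitespace-separated value into a chain of units (assumes a non-blank value). */
    static ValueUnit parse(final String value) {
        final String[] parts = value.trim().split("\\s+");
        final ValueUnit head = new ValueUnit(parts[0]);
        ValueUnit tail = head;
        for (int i = 1; i < parts.length; i++) {
            tail = tail.append(new ValueUnit(parts[i]));
        }
        return head;
    }
}
```

For example, `ValueUnit.parse("1px solid red").toList()` yields the three units in source order. A real lexical unit would also carry a type and typed values; this sketch keeps only the chaining.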
90+
91+
### `org.htmlunit.cssparser.parser.selector` — Selector Model
92+
93+
Represents CSS selectors as an object model:
94+
95+
- `Selector`, `SimpleSelector` — base types
96+
- `ElementSelector` — type selectors (`h1`, `div`, `*`)
97+
- `DescendantSelector`, `ChildSelector` — combinators (` `, `>`)
98+
- `DirectAdjacentSelector`, `GeneralAdjacentSelector` — combinators (`+`, `~`)
99+
- `PseudoElementSelector` — pseudo-elements (`::before`, `::after`)
100+
- `RelativeSelector` — for `:has()` relative selectors
101+
- `SelectorList` / `SelectorListImpl` — ordered list of selectors
102+
- `SelectorSpecificity` — calculates selector specificity
103+
- `Combinator` — enum of CSS combinator types
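
As a rough illustration of what a specificity calculation involves, the sketch below models specificity as a triple of counts (ids, classes/attributes/pseudo-classes, type selectors) compared from left to right, as the CSS Selectors specification describes. The `Specificity` record is hypothetical and is not the library's `SelectorSpecificity` class.

```java
/** Hypothetical (ids, classes, types) triple; NOT the library's SelectorSpecificity. */
record Specificity(int ids, int classes, int types) implements Comparable<Specificity> {

    /** Adds two specificities component-wise, e.g. when combining parts of a selector. */
    Specificity plus(final Specificity other) {
        return new Specificity(ids + other.ids, classes + other.classes, types + other.types);
    }

    /** Compares ids first, then classes, then types. */
    @Override
    public int compareTo(final Specificity other) {
        if (ids != other.ids) {
            return Integer.compare(ids, other.ids);
        }
        if (classes != other.classes) {
            return Integer.compare(classes, other.classes);
        }
        return Integer.compare(types, other.types);
    }
}
```

Under this model, `#bar` is (1, 0, 0) and `div.foo:hover` is (0, 2, 1), so the id selector wins even though the second selector has more parts.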
104+
105+
### `org.htmlunit.cssparser.parser.condition` — Selector Conditions
106+
107+
Conditions attached to selectors (class, id, attribute, pseudo-class matching):
108+
109+
- `ClassCondition` (`.foo`), `IdCondition` (`#bar`)
110+
- `AttributeCondition` (`[attr=val]`), `PrefixAttributeCondition` (`[attr^=val]`), `SuffixAttributeCondition` (`[attr$=val]`), `SubstringAttributeCondition` (`[attr*=val]`), `OneOfAttributeCondition` (`[attr~=val]`), `BeginHyphenAttributeCondition` (`[attr|=val]`)
111+
- `PseudoClassCondition` (`:hover`, `:nth-child()`, etc.)
112+
- `NotPseudoClassCondition` (`:not()`), `IsPseudoClassCondition` (`:is()`), `HasPseudoClassCondition` (`:has()`), `WherePseudoClassCondition` (`:where()`)
113+
- `LangCondition` (`:lang()`)
114+
115+
### `org.htmlunit.cssparser.parser.media` — Media Queries
116+
117+
- `MediaQuery` — a single media query (`screen and (min-width: 768px)`)
118+
- `MediaQueryList` — a list of media queries
119+
120+
### `org.htmlunit.cssparser.dom` — CSS DOM Implementation
121+
122+
Implements a CSS object model (style sheets, rules, values):
123+
124+
- `CSSStyleSheetImpl` — represents a complete stylesheet
125+
- `CSSStyleRuleImpl` — a style rule (`selector { declarations }`)
126+
- `CSSStyleDeclarationImpl` — a set of property declarations
127+
- `CSSMediaRuleImpl`, `CSSImportRuleImpl`, `CSSPageRuleImpl`, `CSSFontFaceRuleImpl`, `CSSCharsetRuleImpl`, `CSSUnknownRuleImpl` — at-rule implementations
128+
- `CSSRuleListImpl` — ordered list of rules
129+
- `CSSValueImpl` — wraps parsed CSS values
130+
- `Property` — a single CSS property with name, value, and priority
131+
- Color classes: `RGBColorImpl`, `HSLColorImpl`, `HWBColorImpl`, `LABColorImpl`, `LCHColorImpl` (plus `AbstractColor` base)
132+
- `RectImpl`, `CounterImpl` — CSS `rect()` and `counter()` value types
133+
- `MediaListImpl`, `CSSStyleSheetListImpl` — list types
134+
- `DOMExceptionImpl` — DOM exception handling
135+
136+
### `org.htmlunit.cssparser.util` — Utilities
137+
138+
- `ParserUtils` — string processing helpers used by the generated parser (trimming, unescaping)
139+
140+
## Code Style and Quality
141+
142+
### Checkstyle
143+
144+
Checkstyle is **strictly enforced** via `checkstyle.xml` and runs during the build. Key rules:
145+
146+
- **Line length:** 120 characters max
147+
- **Indentation:** 4 spaces per level (no tab characters)
148+
- **Braces:** opening brace on same line (`eol`), closing brace on its own line (`alone`)
149+
- **Naming conventions:**
  - Member fields: `camelCase_` (trailing underscore)
  - Static fields: `CamelCase_` (capital start, trailing underscore)
  - Constants: `UPPER_SNAKE_CASE` (exception: `log`)
  - Methods: `camelCase` (test methods may use underscores: `test[A-Z][a-zA-Z0-9_]+`)
  - Catch parameters: `e`, `ex`, `ignored`, or `expected`
155+
- **Javadoc:** Required on all public/protected methods, types, and packages. Author tag format: `@author Firstname Lastname`
156+
- **Imports:** No star imports, no unused imports, no redundant imports
157+
- **License header:** Required on every source file:
158+
  ```
  /*
   * Copyright (c) 2019-2026 Ronald Brill.
   *
   * Licensed under the Apache License, Version 2.0 ...
   */
  ```
165+
- **No `serialVersionUID`** fields
166+
- **No `@version`** tags
167+
- **No `System.out`/`System.err`** in production code
168+
- **Final local variables** and parameters are enforced
169+
- **No trailing whitespace**, no tab characters, no double blank lines
170+
- Single empty line after package declaration, none before it
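
The rules above can be easier to apply with a concrete picture. The following is a minimal sketch of a class that follows the listed naming, brace, and Javadoc conventions; `RuleCounter` and its members are hypothetical and not part of the project, and the license header is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical example class; not part of the project.
 *
 * @author Firstname Lastname
 */
final class RuleCounter {

    /** Constant: UPPER_SNAKE_CASE. */
    private static final int INITIAL_CAPACITY = 8;

    /** Static field: capital start, trailing underscore. */
    private static int InstanceCount_;

    /** Member field: camelCase with trailing underscore. */
    private final List<String> ruleNames_ = new ArrayList<>(INITIAL_CAPACITY);

    /** Creates a new counter and records the instantiation. */
    RuleCounter() {
        InstanceCount_++;
    }

    /**
     * Registers a rule name.
     *
     * @param ruleName the name to add
     * @return the number of names registered so far
     */
    int add(final String ruleName) {
        ruleNames_.add(ruleName);
        return ruleNames_.size();
    }

    /**
     * Parses a number, falling back to a default on malformed input.
     *
     * @param text the text to parse
     * @param defaultValue the value returned if parsing fails
     * @return the parsed number or the default
     */
    static int parseOrDefault(final String text, final int defaultValue) {
        try {
            return Integer.parseInt(text);
        }
        catch (final NumberFormatException e) {
            return defaultValue;
        }
    }
}
```

Note the `catch` on its own line after the closing brace, the `final` parameters, and the catch parameter named `e`. Whether every detail here passes the project's `checkstyle.xml` is an assumption; the build remains the authority.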
171+
172+
Checkstyle suppressions (`checkstyle_suppressions.xml`):
173+
- Test files are exempt from `JavadocPackage`, `JavadocMethod`, and `LineLength`
174+
- Generated files in `target/generated-sources/javacc` are fully exempt
175+
- `CssCharStream.java` is fully exempt (special character stream handling)
176+
177+
### Testing
178+
179+
- **Framework:** JUnit Jupiter, version 6.x
180+
- **Test dependency:** `commons-io` (test scope only)
181+
- **Test resources:** CSS fixture files in `src/test/resources/`
182+
- **Run tests:** `mvn test` (uses `maven-surefire-plugin`)
183+
184+
## CI/CD
185+
186+
- **CodeQL:** GitHub Actions workflow (`.github/workflows/codeql.yml`) runs security analysis on pushes/PRs to `master` and weekly (Mondays 23:34 UTC). Analyzes Java code only.
187+
- **Dependabot:** Configured via `.github/dependabot.yml` for automated dependency update PRs.
188+
- **Jenkins:** Primary CI runs on an external Jenkins server at `https://jenkins.wetator.org/job/HtmlUnit%20-%20CSS%20Parser/`.
189+
190+
## Making Changes
191+
192+
### Modifying Parser Behavior
193+
194+
1. Edit the JavaCC grammar: `src/main/javacc/CSS3Parser.jj`
195+
2. Run `mvn compile` to regenerate and compile
196+
3. Add/update tests to cover the change
197+
4. Run `mvn test` to verify
198+
199+
### Adding Support for New CSS Features
200+
201+
New CSS features typically require changes in multiple layers:
202+
203+
1. **Grammar** (`CSS3Parser.jj`) — add token definitions and production rules
204+
2. **Lexical units** (`LexicalUnit.java`, `LexicalUnitImpl.java`) — add new `LexicalUnitType` enum values if needed
205+
3. **Conditions** (`parser/condition/`) — for new pseudo-classes or attribute selectors
206+
4. **Selectors** (`parser/selector/`) — for new selector types or combinators
207+
5. **DOM** (`dom/`) — for new at-rule types or value types
208+
6. **Tests** — comprehensive tests for parsing, serialization, and error handling
209+
210+
### Code Conventions for PRs
211+
212+
- Run `mvn -U clean test` and ensure all tests pass
213+
- Checkstyle runs as part of the build; fix all violations it reports
214+
- Follow the naming conventions (especially trailing underscores on fields)
215+
- Add Javadoc to all new public/protected API
216+
- Keep the license header on all new files
217+
- Do not modify generated files in `target/`
218+
219+
## Versioning and Releases
220+
221+
- **Current development:** 5.0.0-SNAPSHOT (requires JDK 17+)
222+
- **Latest stable:** 4.21.0 (December 2025, JDK 8+)
223+
- **Artifacts:** Published to Maven Central via Sonatype Central Publishing
224+
- **Release process:** (from README)
225+
1. Ensure all tests pass
226+
2. Update version in `pom.xml` and `README.md`
227+
3. Commit, build, and deploy: `mvn -up clean deploy`
228+
4. Publish on Maven Central Portal
229+
5. Create GitHub release with signed JARs
230+
6. Bump to next SNAPSHOT version
231+
232+
## Dependencies
233+
234+
### Runtime
235+
236+
**None.** The library has zero runtime dependencies — it is completely self-contained.
237+
238+
### Test Only
239+
240+
- `org.junit.jupiter:junit-jupiter-engine`
241+
- `org.junit.platform:junit-platform-launcher`
242+
- `commons-io:commons-io`
243+
244+
## Key Design Decisions
245+
246+
1. **No SAC dependency:** The `org.w3c.css.sac` API (stalled since 2008) was removed. All interfaces are built-in, giving the project full control over the object model.
247+
2. **JavaCC-based parser:** The CSS grammar is defined in `CSS3Parser.jj` and compiled by JavaCC. This provides robust, specification-aligned tokenization and parsing.
248+
3. **Event-based + DOM-based API:** The parser supports both SAX-like streaming (`DocumentHandler`) and tree-building (`CSSOMParser`) usage patterns.
249+
4. **Zero runtime dependencies:** Makes the library safe to embed anywhere without dependency conflicts.
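
One common way to relate the two usage patterns in item 3 is to treat the tree builder as just another consumer of parse events. The sketch below illustrates that idea with hypothetical types: `RuleEvents` and `TreeBuilder` are invented for this example, and the library's `DocumentHandler` and `CSSOMParser` have different, richer APIs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback interface; the library's DocumentHandler differs. */
interface RuleEvents {
    void startRule(String selector);

    void property(String name, String value);

    void endRule();
}

/** Hypothetical tree builder: collects streamed events into an in-memory list of rules. */
final class TreeBuilder implements RuleEvents {
    private final List<String> rules_ = new ArrayList<>();
    private StringBuilder current_;

    @Override
    public void startRule(final String selector) {
        current_ = new StringBuilder(selector).append(" {");
    }

    @Override
    public void property(final String name, final String value) {
        current_.append(' ').append(name).append(": ").append(value).append(';');
    }

    @Override
    public void endRule() {
        rules_.add(current_.append(" }").toString());
    }

    /** Returns the rules collected so far, in the order they were reported. */
    List<String> getRules() {
        return rules_;
    }
}
```

A streaming consumer would implement `RuleEvents` directly and keep nothing in memory, while a caller that wants a tree would hand the parser a builder like this one and read the result afterwards.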
250+
251+
## Links
252+
253+
- **Repository:** https://github.com/HtmlUnit/htmlunit-cssparser
254+
- **Maven Central:** https://central.sonatype.com/artifact/org.htmlunit/htmlunit-cssparser
255+
- **HtmlUnit:** https://www.htmlunit.org/
256+
- **Developer Blog:** https://htmlunit.github.io/htmlunit-blog/
257+
- **CI:** https://jenkins.wetator.org/job/HtmlUnit%20-%20CSS%20Parser/
258+
- **Sponsor:** https://github.com/sponsors/rbri
259+
- **Predecessor:** http://cssparser.sourceforge.net/

CLAUDE.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
@AGENTS.md

pom.xml

Lines changed: 22 additions & 1 deletion
@@ -28,11 +28,12 @@
         <commons-io.version>2.21.0</commons-io.version>

         <!-- quality -->
-        <checkstyle.version>13.0.0</checkstyle.version>
+        <checkstyle.version>12.3.1</checkstyle.version>
         <dependencycheck.version>10.0.4</dependencycheck.version>

         <!-- plugins -->
         <central-publishing-plugin.version>0.10.0</central-publishing-plugin.version>
+        <build-helper-plugin.version>3.6.1</build-helper-plugin.version>
         <checkstyle-plugin.version>3.6.0</checkstyle-plugin.version>
         <gpg-plugin.version>3.2.8</gpg-plugin.version>
         <enforcer-plugin.version>3.6.2</enforcer-plugin.version>
@@ -46,6 +47,26 @@

     <build>
         <plugins>
+            <plugin>
+                <groupId>org.codehaus.mojo</groupId>
+                <artifactId>build-helper-maven-plugin</artifactId>
+                <version>${build-helper-plugin.version}</version>
+                <executions>
+                    <execution>
+                        <id>add-module-source</id>
+                        <phase>generate-sources</phase>
+                        <goals>
+                            <goal>add-source</goal>
+                        </goals>
+                        <configuration>
+                            <sources>
+                                <source>src/main/module-info</source>
+                            </sources>
+                        </configuration>
+                    </execution>
+                </executions>
+            </plugin>
+
             <plugin>
                 <groupId>org.apache.maven.plugins</groupId>
                 <artifactId>maven-enforcer-plugin</artifactId>

src/main/module-info/readme.md

Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,29 @@
1+
# Module Info Directory
2+
3+
This directory contains the `module-info.java` file for the Java Platform Module System (JPMS).
4+
5+
## Why is this in a separate directory?
6+
7+
The `module-info.java` file is kept separate from `src/main/java/` for the following reasons:
8+
9+
1. **Eclipse IDE Compatibility**: Eclipse has known issues with Java modules, particularly when the project uses older Java versions or has complex module configurations. Keeping `module-info.java` separate prevents Eclipse from attempting to process it during development.
10+
11+
2. **Build-Time Integration**: The module descriptor is added to the compilation during the Maven build process via the `build-helper-maven-plugin`, which adds this directory as a source folder only during the build.
12+
13+
3. **IDE Independence**: This approach allows developers to use Eclipse without module-related compilation errors while still producing proper modular JARs when building with Maven.
14+
15+
## How it works
16+
17+
The Maven build process:
18+
1. Adds `src/main/module-info/` as an additional source directory during the `generate-sources` phase (via the `build-helper-maven-plugin`)
2. Compiles the regular Java sources from `src/main/java/` together with `module-info.java`
3. Packages everything into a modular JAR
22+
23+
## For Developers
24+
25+
- **Eclipse users**: You don't need to worry about this file - Eclipse won't see it
26+
- **IntelliJ users**: IntelliJ handles modules better and can work with this setup
27+
- **Maven builds**: The module descriptor is automatically included in the final JAR
28+
29+
If you need to modify the module descriptor, edit `src/main/module-info/module-info.java` directly.
