Skip to content
Draft
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Example (dist) files, idempotent build and other fixes
  • Loading branch information
André L F S Bacci committed Dec 6, 2024
commit 43b9e9e02b947442ea2b2833c59a44e728fdf62a
19 changes: 19 additions & 0 deletions entities/global.ent-dist
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
<?xml version="1.0" encoding="utf8" ?>
<!--
This is a "global" XML Entity file.
See doc-base/script/entities.php for details.

Place here only small entities that are expected NOT
being replaced or translated in any part of the manual.

DO NOT COPY OR TRANSLATE THIS FILE.

If you want/need to translate some entity placed here,
open an issue on doc-base or doc-en repository, so the
translatable entity is moved to manual.ent.
-->
<entities xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">

<!-- <entity name="ent.name">XML fragment</entity> -->

</entities>
14 changes: 14 additions & 0 deletions entities/manual.ent-dist
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
<?xml version="1.0" encoding="utf8" ?>
<!-- $Revision:$ -->
<!--
This is a "manual" XML Entity file.
See doc-base/script/entities.php for details.

Place here only small entities that are expected to be
translated and/or replaced in any part of the manual.
-->
<entities xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">

<!-- <entity name="ent.name">XML fragment</entity> -->

</entities>
20 changes: 20 additions & 0 deletions entities/remove.ent-dist
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
<?xml version="1.0" encoding="utf8" ?>
<!--
This is a "remove" XML Entity file.
See doc-base/script/entities.php for details.

Place here only deprecated entities that are expected NOT
be being used in any part of the manual.

DO NOT COPY OR TRANSLATE THIS FILE.

If an entity moved here is used in a language in your control,
this means that the entity on the original file is planned to be
removed, or already was removed. Consult the original text file
and apply changes accordly.
-->
<entities xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">

<!-- <entity name="ent.name">XML fragment</entity> -->

</entities>
116 changes: 59 additions & 57 deletions scripts/entities.php
Original file line number Diff line number Diff line change
Expand Up @@ -17,12 +17,12 @@

# Mental model, or things that I would liked to know 20 years prior

XML Entity processing has more in common with DOMDocumentFragment than
DTD Entity processing has more in common with DOMDocumentFragment than
DOMElement. In other words, simple text and multi rooted XML files
are valid <!ENTITY> contents, whereas they are not valid XML documents.

Also, namespaces do not automatically "cross" between a parent
document and their includes, even if they are included in the same
document and their entities, even if they are included in the same
file, as local textual entities. <!ENTITY>s are, for all intended
purposes, separated documents, with separated namespaces and have
*expected* different default namespaces.
Expand All @@ -36,11 +36,11 @@

# Output

This script collects bundled and individual entity files (detailed
This script collects grouped and individual entity files (detailed
below), at some expected relative paths, and generates an
.entities.ent file, in a sibling position to manual.xml.in.

The output .entities.ent file has no duplications, so collection
The output file .entities.ent has no duplications, so collection
order is important to keep the necessary operational semantics. Here,
newer loaded entities takes priority (overwrites) over previous one.
Note that this is the reverse of <!ENTITY> convention, where
Expand All @@ -49,52 +49,56 @@
are being overwriten, or if translatable entities are missing
translations.

# Individual tracked entities, or `.xml` files at `entities/`
# Individual XML Entities, or `.xml` files at `entities/`

As explained above, the individual entity contents are not really
valid XML *documents*, they are only at most valid XML *fragments*.

Yet, individual entities are stored in entities/ as .xml files, for
two reasons: first, text editors in general can highlights XML syntax,
and second, this allows normal revision tracking on then, without
requiring weird changes on `revcheck.php`.
even for XML fragments, and second, this allows normal revision tracking
per file, without requiring weird changes on `revcheck.php`. Note that
is *invalid* to place XML declaration in these fragment files, at least
in files that are invalid XML documents (on multi node rooted ones).

# Bundled entities files, group tracked
# Grouped entities files, file tracked

For very small textual entities, down to simple text words or single
tag elements, that may never change, individual entity tracking is
an overkill. This script also loads bundled entities files, at
an overkill. This script also loads grouped XML Entities files, at
some expected locations, with specific semantics.

These bundle files are really normal XML files, correctly annotated
These grouped files are really normal XML files, correctly annotated
with XML namespaces used on manual, so any individual exported entity
have corret XML namespace annotations. These bundle entity files
are revcheck tracked normaly, but are not included in manual.xml.in,
as they only participate in general entity loading, described above.
have correct anc clean XML namespace annotations. These grouped entity
files are tracked normally by revcheck, but are not directly included
in manual.xml.in, as they only participate in general entity loading,
described above.

- global.ent - expected untranslated
- manual.ent - expected translated
- lang/entities/* - expected translated
- global.ent - expected unreplaced
- manual.ent - expected replaced (translated)
- remove.ent - expected unused
- lang/entities/* - expected replaced (translated)

*/

const PARTIAL_IMPL = true; // For while XML Entities are not fully implanted in all languages

ini_set( 'display_errors' , 1 );
ini_set( 'display_startup_errors' , 1 );
error_reporting( E_ALL );

const PARTIAL_IMPL = true; // For while spliting and bundle convertion are incomplete

if ( count( $argv ) < 2 || in_array( '--help' , $argv ) || in_array( '-h' , $argv ) )
{
fwrite( STDERR , "\nUsage: {$argv[0]} [--debug] entitiesDir [entitiesDir]\n\n" );
fwrite( STDERR , "\nUsage: {$argv[0]} [--debug] langCode [langCode]\n\n" );
return;
}

$filename = Entities::rotateOutputFile();
$filename = Entities::rotateOutputFile(); // idempotent

$langs = [];
$normal = true; // configure.php mode
$debug = false; // detailed output
$normal = true; // Normal configure.php mode
$debug = false; // Detailed console mode

for( $idx = 1 ; $idx < count( $argv ) ; $idx++ )
if ( $argv[$idx] == "--debug" )
Expand Down Expand Up @@ -125,10 +129,10 @@
echo " done: " , Entities::$countTotalGenerated , " entities";
if ( Entities::$countUnstranslated > 0 )
echo ", " , Entities::$countUnstranslated , " untranslated";
if ( Entities::$countConstantReplaced > 0 )
echo ", " , Entities::$countConstantReplaced , " global replaced";
if ( Entities::$countRemoveReplaced > 0 )
echo ", " , Entities::$countRemoveReplaced , " to be removed";
if ( Entities::$countReplacedGlobal > 0 )
echo ", " , Entities::$countReplacedGlobal , " global replaced";
if ( Entities::$countReplacedRemove > 0 )
echo ", " , Entities::$countReplacedRemove , " remove replaced";
echo ".\n";

exit;
Expand All @@ -143,19 +147,19 @@ public function __construct(

class Entities
{
public static int $countConstantReplaced = 0;
public static int $countUnstranslated = 0;
public static int $countRemoveReplaced = 0;
public static int $countReplacedGlobal = 0;
public static int $countReplacedRemove = 0;
public static int $countTotalGenerated = 0;

private static string $filename = __DIR__ . "/../.entities.ent"; // sibling of .manual.xml
private static string $filename = __DIR__ . "/../temp/entities.ent"; // idempotent

private static array $entities = []; // All entities, overwriten
private static array $global = []; // Entities from global.ent files
private static array $global = []; // Entities expected not replaced
private static array $replace = []; // Entities expected replaced / translated
private static array $remove = []; // Entities expected removed
private static array $remove = []; // Entities expected not replaced and not used
private static array $count = []; // Name / Count
private static array $slow = []; // External entities, slowless, overwrite
private static array $slow = []; // External entities, slow, uncontroled overwrite

static function put( string $path , string $name , string $text , bool $global = false , bool $replace = false , bool $remove = false )
{
Expand Down Expand Up @@ -189,7 +193,6 @@ static function rotateOutputFile()
if ( file_exists( Entities::$filename ) )
unlink( Entities::$filename );
touch( Entities::$filename );

Entities::$filename = realpath( Entities::$filename ); // only full paths on XML
}

Expand All @@ -201,51 +204,51 @@ static function writeOutputFile()
static function checkReplaces( bool $debug )
{
Entities::$countTotalGenerated = count( Entities::$entities );
Entities::$countConstantReplaced = 0;
Entities::$countUnstranslated = 0;
Entities::$countRemoveReplaced = 0;
Entities::$countReplacedGlobal = 0;
Entities::$countReplacedRemove = 0;

foreach( Entities::$entities as $name => $text )
{
$replaced = Entities::$count[$name] - 1;
$expectedConstant = in_array( $name , Entities::$global );
$expectedGlobal = in_array( $name , Entities::$global );
$expectedReplaced = in_array( $name , Entities::$replace );
$expectedRemoved = in_array( $name , Entities::$remove );

if ( $expectedConstant && $replaced != 0 )
if ( $expectedGlobal && $replaced != 0 )
{
Entities::$countConstantReplaced++;
Entities::$countReplacedGlobal++;
if ( $debug )
print "Expected global, replaced $replaced times:\t$name\n";
print "Expected global, replaced $replaced times: $name\n";
}

if ( $expectedReplaced && $replaced != 1 )
{
Entities::$countUnstranslated++;
if ( $debug )
print "Expected translated, replaced $replaced times:\t$name\n";
print "Expected translated, replaced $replaced times: $name\n";
}

if ( $expectedRemoved && $replaced != 0 )
{
Entities::$countRemoveReplaced++;
Entities::$countReplacedRemove++;
if ( $debug )
print "Expected removed, replaced $replaced times:\t$name\n";
print "Expected removed, replaced $replaced times: $name\n";
}
}
}
}

function loadEnt( string $path , bool $global = false , bool $translate = false , bool $remove = false , bool $warnMissing = false )
{
$absolute = realpath( $path );
if ( $absolute === false )
$realpath = realpath( $path );
if ( $realpath === false )
if ( PARTIAL_IMPL )
return;
else
if ( $warnMissing )
fwrite( STDERR , "\n Missing entity file: $path\n" );
$path = $absolute;
$path = $realpath;

$text = file_get_contents( $path );
$text = str_replace( "&" , "&amp;" , $text );
Expand All @@ -259,7 +262,7 @@ function loadEnt( string $path , bool $global = false , bool $translate = false

foreach( $list as $ent )
{
// weird, namespace correting, DOMNodeList -> DOMDocumentFragment
// weird, namespace correting, DOMNodeList -> DOMDocumentFragment transform
$other = new DOMDocument( '1.0' , 'utf8' );

foreach( $ent->childNodes as $node )
Expand All @@ -268,8 +271,8 @@ function loadEnt( string $path , bool $global = false , bool $translate = false
$name = $ent->getAttribute( "name" );
$text = $other->saveXML();

$text = str_replace( "&amp;" , "&" , $text );
$text = rtrim( $text , "\n" );
$text = str_replace( "&amp;" , "&" , $text );
$lines = explode( "\n" , $text );
array_shift( $lines ); // remove XML declaration
$text = implode( "\n" , $lines );
Expand All @@ -292,7 +295,7 @@ function loadDir( array $langs , string $lang )
return;
}
else
exit( "Not directory: $dir\n" );
exit( "Error: not a directory: $dir\n" );

$files = scandir( $dir );
$expectedReplaced = array_search( $lang , $langs ) > 0;
Expand All @@ -301,10 +304,10 @@ function loadDir( array $langs , string $lang )
{
$path = realpath( "$dir/$file" );

if ( is_dir( $path ) )
continue;
if ( str_starts_with( $file , '.' ) )
continue;
if ( is_dir( $path ) )
continue;

$text = file_get_contents( $path );
$text = rtrim( $text , "\n" );
Expand All @@ -315,18 +318,17 @@ function loadDir( array $langs , string $lang )

function loadXml( string $path , string $text , bool $expectedReplaced )
{
$info = pathinfo( $path );
$name = $info["filename"];
$frag = "<frag>$text</frag>";

if ( trim( $text ) == "" )
{
fwrite( STDERR , "\n Empty entity (should it be in remove.ent?): '$path' \n" );
Entities::put( $pat , $text , remove: true );
Entities::put( $path , $name , $text );
return;
}

$info = pathinfo( $path );
$name = $info["filename"];

$frag = "<frag>$text</frag>";

$dom = new DOMDocument( '1.0' , 'utf8' );
$dom->recover = true;
$dom->resolveExternals = false;
Expand Down Expand Up @@ -354,7 +356,7 @@ function loadXml( string $path , string $text , bool $expectedReplaced )

function saveEntitiesFile( string $filename , array $entities )
{
$tmpDir = __DIR__ . "/entities";
$tmpDir = __DIR__ . "/temp"; // idempotent

$file = fopen( $filename , "w" );
fputs( $file , "\n<!-- DO NOT COPY / DO NOT TRANSLATE - Autogenerated by entities.php -->\n\n" );
Expand Down