Skip to content
  • speters's avatar
    [BUGFIX] Handle large number of pages in linkvalidator · 7fecd98c
    speters authored and Christian Kuhn's avatar Christian Kuhn committed
    This patch handles problems which occured with a large number of pages
    and also for a large number of records to be checked for broken links.
    The problems previously resulted in "Prepared statement contains too
    many placeholders" exceptions and consumption of too much RAM and
    (possibly) out of memory errors.
    
    Technical details about the implementation follows because this is
    handled slightly differently than in other areas of the core.
    
    LinkValidator used named placeholders to build database queries,
    and the EditableRestriction class also used a number of placeholder.
    
    Technically 'doctrine/dbal' explodes named placeholders for array
    values with quote type 'PARAM_INT_ARRAY' or 'PARAM_STR_ARRAY' to
    single unnamed placeholders for each value, using one placeholder
    for each array value. If multiple array value placeholder are used
    with a large number of values this could quickly emit the "Prepared
    statement contains too many placeholders" error as the limit is
    reached. The limit differs beetween the various database systems.
    One way to work around this issue is to do database operations in
    chunks, but it could become quite difficult to calculate the chunk
    size beforehand.
    
    Array values are mainly used for 'in()' or 'notIn()' query expressions,
    which naturally do not not have direct limits for the value count in it
    through all supported dbms systems. This opens the possibility to use
    direct value lists for these expressions. There are other dbms
    limitations depending on the used dbms (for example on mariadb/mysql
    'max_allowed_packets', max query size setting for SQLite), which may be
    adjusted in some cases. However we could not safely expect that these
    values are monitored and adjusted, so we still need to do database
    operations in chunks, but make it a little easier for 'chunksize'
    calculations.
    
    During testing on an instance with a large number of pages, it was
    found that LinkValidator used up a lot of memory, reaching the
    configured memory limit of 2G on that system, which itself was very
    high already (and far beyond the recommended 256MB limit).
    Instead of collecting unlimited results in an array before checking
    links against this array, doing this for each record directly avoids
    excessive memory consumption during link checking.
    
    It was decided to do this in this patch directly instead of a separate
    patch, as this patch otherwise would not be testable in real
    instances.
    
    This patch does the following:
    
    * introduce two method to implode value arrays in QueryHelper
    * replace placeholder in EditableRestriction with quoted values
    * replace placeholder in BrokenLinksRepository with quoted values
    * introduce chunk database operations in BrokenLinkRepository
    * introduce chunk database operations in LinkAnalyzer
    * move $results array and call checkLinks() on it to the inner
      loop to process this for each record (instead of accumulating the
      results in a huge array.
    
    In the future we need to investigate and find a more generic way to
    handle such scenarios for larger instances. This may take some time
    and investigation to do this properly and in a reusable way. To not
    stall the bugfix further, we are doing the workarounds in this patch
    to solve these issues in the meantime.
    
    Resolves: #92493
    Releases: master, 11.5
    Change-Id: I9a596df6749d07afee9c5ac37266b876ddc6e297
    Reviewed-on: https://review.typo3.org/c/Packages/TYPO3.CMS/+/66896
    
    
    Tested-by: default avatarcore-ci <typo3@b13.com>
    Tested-by: default avatarStefan Bürk <stefan@buerk.tech>
    Tested-by: default avatarChristian Kuhn <lolli@schwarzbu.ch>
    Reviewed-by: default avatarStefan Bürk <stefan@buerk.tech>
    Reviewed-by: default avatarChristian Kuhn <lolli@schwarzbu.ch>
    7fecd98c