Html
extends BaseReader
Table of Contents
Constants
- TEST_SAMPLE_SIZE = 2048
- Sample size to read to determine if it's HTML or not.
- FORMATS = array of default style definitions keyed by HTML tag name (h1 to h6, a, hr, strong, b, i, em); the full value is listed in the Constants section
- Default formats for supported HTML tags.
Properties
- $allowExternalImages : bool
- Allow external images. Use with caution.
- $createBlankSheetIfNoneRead : bool
- Create a blank sheet if none are read, possibly due to a typo when using LoadSheetsOnly.
- $dataArray : array<string|int, array<string|int, mixed>>
- Data Array used for testing only, should write to Spreadsheet object on completion of tests.
- $fileHandle : resource
- $ignoreRowsWithNoCells : bool
- Ignore rows with no cells? Identifies whether the Reader should ignore rows with no cells.
- $includeCharts : bool
- Read charts that are defined in the workbook? Identifies whether the Reader should read the definitions for any charts that exist in the workbook.
- $inputEncoding : string
- Input encoding.
- $loadSheetsOnly : null|array<string|int, string>
- Restrict which sheets should be loaded? This property holds an array of worksheet names to be loaded. If null, then all worksheets will be loaded.
- $nestedColumn : array<string|int, string>
- $readDataOnly : bool
- Read data only? Identifies whether the Reader should only read data values for cells, and ignore any formatting information; or whether it should read both data and formatting.
- $readEmptyCells : bool
- Read empty cells? Identifies whether the Reader should read data values for all cells, or should ignore cells containing null value or empty string.
- $readFilter : IReadFilter
- IReadFilter instance.
- $rowspan : array<string, bool>
- $securityScanner : XmlScanner|null
- $sheetIndex : int
- Sheet index to read.
- $tableLevel : int
- $valueBinder : IValueBinder|null
Methods
- __construct() : mixed
- Create a new HTML Reader instance.
- canRead() : bool
- Validate that the current file is an HTML file.
- getAllowExternalImages() : bool
- getBorderMappings() : array<string, string>
- getBorderStyle() : string|null
- Map HTML border style to PhpSpreadsheet border style.
- getIgnoreRowsWithNoCells() : bool
- getIncludeCharts() : bool
- Read charts in workbook? If this is true, then the Reader will include any charts that exist in the workbook.
- getLoadSheetsOnly() : null|array<string|int, string>
- Get which sheets to load. Returns either an array of worksheet names (the list of worksheets that should be loaded), or null, indicating that all worksheets in the workbook should be loaded.
- getReadDataOnly() : bool
- Read data only? If this is true, then the Reader will only read data values for cells; it will not read any formatting or structural information (like merges).
- getReadEmptyCells() : bool
- Read empty cells? If this is true (the default), then the Reader will read data values for all cells, irrespective of value.
- getReadFilter() : IReadFilter
- Read filter.
- getSecurityScanner() : XmlScanner|null
- getSecurityScannerOrThrow() : XmlScanner
- getSheetIndex() : int
- Get sheet index.
- getStyleColor() : string
- Check whether the value has a leading #, so that a clean hex value can be returned.
- getValueBinder() : IValueBinder|null
- listWorksheetInfo() : array<int, array{worksheetName: string, lastColumnLetter: string, lastColumnIndex: int, totalRows: int, totalColumns: int, sheetState: string}>
- Return worksheet info (Name, Last Column Letter, Last Column Index, Total Rows, Total Columns).
- listWorksheetNames() : array<string|int, string>
- Returns names of the worksheets from a file, possibly without parsing the whole file to a Spreadsheet object.
- load() : Spreadsheet
- Loads Spreadsheet from file.
- loadFromString() : Spreadsheet
- Loads a Spreadsheet from string content.
- loadIntoExisting() : Spreadsheet
- Loads a file into an existing Spreadsheet instance.
- loadSpreadsheetFromFile() : Spreadsheet
- Loads Spreadsheet from file.
- setAllowExternalImages() : self
- Allow external images. Use with caution.
- setCreateBlankSheetIfNoneRead() : self
- Create a blank sheet if none are read, possibly due to a typo when using LoadSheetsOnly.
- setIgnoreRowsWithNoCells() : self
- setIncludeCharts() : $this
- Set read charts in workbook. Set to true to advise the Reader to include any charts that exist in the workbook.
- setLoadAllSheets() : $this
- Set all sheets to load. Tells the Reader to load all worksheets from the workbook.
- setLoadSheetsOnly() : $this
- Set which sheets to load.
- setReadDataOnly() : $this
- Set read data only. Set to true to advise the Reader to read only data values for cells, and to ignore any formatting or structural information (like merges).
- setReadEmptyCells() : $this
- Set read empty cells. Set to true (the default) to advise the Reader to read data values for all cells, irrespective of value.
- setReadFilter() : $this
- Set read filter.
- setSheetIndex() : $this
- Set sheet index.
- setValueBinder() : self
- flushCell() : void
- Flush cell.
- getTableStartColumn() : string
- newSpreadsheet() : Spreadsheet
- openFile() : void
- Open file for reading.
- processDomElement() : void
- processFlags() : void
- releaseTableStartColumn() : string
- replaceNonAsciiIfNeeded() : string|null
- setTableStartColumn() : string
Constants
TEST_SAMPLE_SIZE
Sample size to read to determine if it's HTML or not.
    public mixed TEST_SAMPLE_SIZE = 2048
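As a rough illustration of why a bounded sample can be enough for format detection, the sketch below reads at most 2048 bytes of a file and looks for common HTML markers. The `looksLikeHtml()` helper, the `SAMPLE_SIZE` constant, and the regular expression are hypothetical; this is not PhpSpreadsheet's implementation of `canRead()`, which this reference does not show.

```php
<?php
// Hypothetical constant mirroring TEST_SAMPLE_SIZE; not part of the library.
const SAMPLE_SIZE = 2048;

/**
 * Hypothetical helper: guess whether a file contains HTML by inspecting
 * only its first $sampleSize bytes.
 */
function looksLikeHtml(string $filename, int $sampleSize = SAMPLE_SIZE): bool
{
    if (!is_file($filename) || !is_readable($filename)) {
        return false;
    }

    $handle = fopen($filename, 'rb');
    if ($handle === false) {
        return false;
    }

    // Read a bounded sample so large files are never loaded in full.
    $sample = fread($handle, $sampleSize);
    fclose($handle);

    if ($sample === false || $sample === '') {
        return false;
    }

    // Assumed heuristic: an opening tag that commonly starts an HTML
    // document or a table fragment.
    return preg_match('/<\s*(!doctype\s+html|html|head|body|table)\b/i', $sample) === 1;
}
```

A real reader may apply stricter or different checks; the point is only that the decision is made from a small prefix of the file.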
FORMATS
Default formats for supported HTML tags.
    protected mixed FORMATS = [
        'h1' => ['font' => ['bold' => true, 'size' => 24]], // Bold, 24pt
        'h2' => ['font' => ['bold' => true, 'size' => 18]], // Bold, 18pt
        'h3' => ['font' => ['bold' => true, 'size' => 13.5]], // Bold, 13.5pt
        'h4' => ['font' => ['bold' => true, 'size' => 12]], // Bold, 12pt
        'h5' => ['font' => ['bold' => true, 'size' => 10]], // Bold, 10pt
        'h6' => ['font' => ['bold' => true, 'size' => 7.5]], // Bold, 7.5pt
        'a' => ['font' => ['underline' => true, 'color' => ['argb' => \PhpOffice\PhpSpreadsheet\Style\Color::COLOR_BLUE]]], // Blue underlined
        'hr' => ['borders' => ['bottom' => ['borderStyle' => \PhpOffice\PhpSpreadsheet\Style\Border::BORDER_THIN, 'color' => [\PhpOffice\PhpSpreadsheet\Style\Color::COLOR_BLACK]]]], // Bottom border
        'strong' => ['font' => ['bold' => true]], // Bold
        'b' => ['font' => ['bold' => true]], // Bold
        'i' => ['font' => ['italic' => true]], // Italic
        'em' => ['font' => ['italic' => true]], // Italic
    ]
Properties
$allowExternalImages
Allow external images. Use with caution.
    protected bool $allowExternalImages = false
Improper specification of external images within a spreadsheet can expose the caller to security exploits.
$createBlankSheetIfNoneRead
Create a blank sheet if none are read, possibly due to a typo when using LoadSheetsOnly.
    protected bool $createBlankSheetIfNoneRead = false
$dataArray
Data Array used for testing only, should write to Spreadsheet object on completion of tests.
    protected array<string|int, array<string|int, mixed>> $dataArray = []
$fileHandle
    protected resource $fileHandle
$ignoreRowsWithNoCells
Ignore rows with no cells? Identifies whether the Reader should ignore rows with no cells.
    protected bool $ignoreRowsWithNoCells = false
Currently implemented only for Xlsx.
$includeCharts
Read charts that are defined in the workbook? Identifies whether the Reader should read the definitions for any charts that exist in the workbook.
    protected bool $includeCharts = false
$inputEncoding
Input encoding.
    protected string $inputEncoding = 'ANSI'
$loadSheetsOnly
Restrict which sheets should be loaded? This property holds an array of worksheet names to be loaded. If null, then all worksheets will be loaded.
    protected null|array<string|int, string> $loadSheetsOnly = null
This property is ignored for Csv, Html, and Slk.
$nestedColumn
    protected array<string|int, string> $nestedColumn = ['A']
$readDataOnly
Read data only? Identifies whether the Reader should only read data values for cells, and ignore any formatting information; or whether it should read both data and formatting.
    protected bool $readDataOnly = false
$readEmptyCells
Read empty cells? Identifies whether the Reader should read data values for all cells, or should ignore cells containing null value or empty string.
    protected bool $readEmptyCells = true
$readFilter
IReadFilter instance.
    protected IReadFilter $readFilter
$rowspan
    protected array<string, bool> $rowspan = []
$securityScanner
    protected XmlScanner|null $securityScanner = null
$sheetIndex
Sheet index to read.
    protected int $sheetIndex = 0
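One way the sheet index might be used is to load several HTML strings into separate worksheets of a single workbook. The sketch below assumes PhpSpreadsheet is installed and that Composer's autoloader has already been loaded; `loadTwoHtmlSheets()` is a hypothetical helper, while `setSheetIndex()` and `loadFromString()` are the methods listed in this reference.

```php
<?php
// Assumes Composer's autoloader is loaded elsewhere,
// for example with: require 'vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\Reader\Html;
use PhpOffice\PhpSpreadsheet\Spreadsheet;

/**
 * Hypothetical helper (not part of the library): place two HTML
 * documents on worksheets 0 and 1 of the same Spreadsheet.
 */
function loadTwoHtmlSheets(string $firstHtml, string $secondHtml): Spreadsheet
{
    $reader = new Html();

    // The default sheet index is 0, so the first document lands on the first sheet.
    $spreadsheet = $reader->loadFromString($firstHtml);

    // Point the reader at the second sheet, then load into the existing workbook.
    $reader->setSheetIndex(1);

    return $reader->loadFromString($secondHtml, $spreadsheet);
}
```

Passing the existing Spreadsheet as the second argument is what keeps both documents in one workbook; without it, each call returns a new Spreadsheet.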
$tableLevel
    protected int $tableLevel = 0
$valueBinder
    protected IValueBinder|null $valueBinder = null
Methods
__construct()
Create a new HTML Reader instance.
    public __construct() : mixed
canRead()
Validate that the current file is an HTML file.
    public canRead(string $filename) : bool
Parameters
- $filename : string
Return values
bool
getAllowExternalImages()
    public getAllowExternalImages() : bool
Return values
bool
getBorderMappings()
    public static getBorderMappings() : array<string, string>
Return values
array<string, string>
getBorderStyle()
Map HTML border style to PhpSpreadsheet border style.
    public getBorderStyle(string $style) : string|null
Parameters
- $style : string
Return values
string|null
getIgnoreRowsWithNoCells()
    public getIgnoreRowsWithNoCells() : bool
Return values
bool
getIncludeCharts()
Read charts in workbook? If this is true, then the Reader will include any charts that exist in the workbook.
    public getIncludeCharts() : bool
Note that a ReadDataOnly value of true overrides this setting, and charts won't be read regardless of the IncludeCharts value. If false (the default), the Reader will ignore any charts defined in the workbook file.
Return values
bool
getLoadSheetsOnly()
Get which sheets to load. Returns either an array of worksheet names (the list of worksheets that should be loaded), or null, indicating that all worksheets in the workbook should be loaded.
    public getLoadSheetsOnly() : null|array<string|int, string>
Return values
null|array<string|int, string>
getReadDataOnly()
Read data only? If this is true, then the Reader will only read data values for cells; it will not read any formatting or structural information (like merges).
    public getReadDataOnly() : bool
If false (the default), it will read data and formatting.
Return values
bool
getReadEmptyCells()
Read empty cells? If this is true (the default), then the Reader will read data values for all cells, irrespective of value.
    public getReadEmptyCells() : bool
If false, it will not read data for cells containing a null value or an empty string.
Return values
bool
getReadFilter()
Read filter.
    public getReadFilter() : IReadFilter
Return values
IReadFilter
getSecurityScanner()
    public getSecurityScanner() : XmlScanner|null
Return values
XmlScanner|null
getSecurityScannerOrThrow()
    public getSecurityScannerOrThrow() : XmlScanner
Return values
XmlScanner
getSheetIndex()
Get sheet index.
    public getSheetIndex() : int
Return values
int
getStyleColor()
Check whether the value has a leading #, so that a clean hex value can be returned.
    public getStyleColor(string|null $value) : string
Parameters
- $value : string|null
Return values
string
getValueBinder()
    public getValueBinder() : IValueBinder|null
Return values
IValueBinder|null
listWorksheetInfo()
Return worksheet info (Name, Last Column Letter, Last Column Index, Total Rows, Total Columns).
    public listWorksheetInfo(string $filename) : array<int, array{worksheetName: string, lastColumnLetter: string, lastColumnIndex: int, totalRows: int, totalColumns: int, sheetState: string}>
Parameters
- $filename : string
Return values
array<int, array{worksheetName: string, lastColumnLetter: string, lastColumnIndex: int, totalRows: int, totalColumns: int, sheetState: string}>
listWorksheetNames()
Returns names of the worksheets from a file, possibly without parsing the whole file to a Spreadsheet object.
    public listWorksheetNames(string $filename) : array<string|int, string>
Readers will often have a more efficient method with which they can override this method.
Parameters
- $filename : string
Return values
array<string|int, string>
load()
Loads Spreadsheet from file.
    public load(string $filename[, int $flags = 0 ]) : Spreadsheet
Parameters
- $filename : string
    The name of the file to load.
- $flags : int = 0
    The optional second parameter, flags, may be used to identify specific elements that should be loaded but are not loaded by default, using these values: IReader::LOAD_WITH_CHARTS (include any charts that are defined in the loaded file).
Return values
Spreadsheet
loadFromString()
Loads a Spreadsheet from string content.
    public loadFromString(string $content[, Spreadsheet|null $spreadsheet = null ]) : Spreadsheet
Parameters
- $content : string
- $spreadsheet : Spreadsheet|null = null
Return values
Spreadsheet
loadIntoExisting()
Loads a file into an existing Spreadsheet instance.
    public loadIntoExisting(string $filename, Spreadsheet $spreadsheet) : Spreadsheet
Parameters
- $filename : string
- $spreadsheet : Spreadsheet
Return values
Spreadsheet
loadSpreadsheetFromFile()
Loads Spreadsheet from file.
    public loadSpreadsheetFromFile(string $filename) : Spreadsheet
Parameters
- $filename : string
Return values
Spreadsheet
setAllowExternalImages()
Allow external images. Use with caution.
    public setAllowExternalImages(bool $allowExternalImages) : self
Improper specification of external images within a spreadsheet can expose the caller to security exploits.
Parameters
- $allowExternalImages : bool
Return values
self
setCreateBlankSheetIfNoneRead()
Create a blank sheet if none are read, possibly due to a typo when using LoadSheetsOnly.
    public setCreateBlankSheetIfNoneRead(bool $createBlankSheetIfNoneRead) : self
Parameters
- $createBlankSheetIfNoneRead : bool
Return values
self
setIgnoreRowsWithNoCells()
    public setIgnoreRowsWithNoCells(bool $ignoreRowsWithNoCells) : self
Parameters
- $ignoreRowsWithNoCells : bool
Return values
self
setIncludeCharts()
Set read charts in workbook. Set to true to advise the Reader to include any charts that exist in the workbook.
    public setIncludeCharts(bool $includeCharts) : $this
Note that a ReadDataOnly value of true overrides this setting, and charts won't be read regardless of the IncludeCharts value. Set to false (the default) to discard charts.
Parameters
- $includeCharts : bool
Return values
$this
setLoadAllSheets()
Set all sheets to load. Tells the Reader to load all worksheets from the workbook.
    public setLoadAllSheets() : $this
Return values
$this
setLoadSheetsOnly()
Set which sheets to load.
    public setLoadSheetsOnly(null|string|array<string|int, string> $sheetList) : $this
Parameters
- $sheetList : null|string|array<string|int, string>
Return values
$this
setReadDataOnly()
Set read data only. Set to true to advise the Reader to read only data values for cells, and to ignore any formatting or structural information (like merges).
    public setReadDataOnly(bool $readCellValuesOnly) : $this
Set to false (the default) to advise the Reader to read both data and formatting for cells.
Parameters
- $readCellValuesOnly : bool
Return values
$this
setReadEmptyCells()
Set read empty cells. Set to true (the default) to advise the Reader to read data values for all cells, irrespective of value.
    public setReadEmptyCells(bool $readEmptyCells) : $this
Set to false to advise the Reader to ignore cells containing a null value or an empty string.
Parameters
- $readEmptyCells : bool
Return values
$this
setReadFilter()
Set read filter.
    public setReadFilter(IReadFilter $readFilter) : $this
Parameters
- $readFilter : IReadFilter
Return values
$this
setSheetIndex()
Set sheet index.
    public setSheetIndex(int $sheetIndex) : $this
Parameters
- $sheetIndex : int
    Sheet index
Return values
$this
setValueBinder()
    public setValueBinder(IValueBinder|null $valueBinder) : self
Parameters
- $valueBinder : IValueBinder|null
Return values
self
flushCell()
Flush cell.
    protected flushCell(Worksheet $sheet, string $column, int|string $row, mixed &$cellContent, array<string|int, string> $attributeArray) : void
Parameters
- $sheet : Worksheet
- $column : string
- $row : int|string
- $cellContent : mixed
- $attributeArray : array<string|int, string>
getTableStartColumn()
    protected getTableStartColumn() : string
Return values
string
newSpreadsheet()
    protected newSpreadsheet() : Spreadsheet
Return values
Spreadsheet
openFile()
Open file for reading.
    protected openFile(string $filename) : void
Parameters
- $filename : string
processDomElement()
    protected processDomElement(DOMNode $element, Worksheet $sheet, int &$row, string &$column, string &$cellContent) : void
Parameters
- $element : DOMNode
- $sheet : Worksheet
- $row : int
- $column : string
- $cellContent : string
processFlags()
    protected processFlags(int $flags) : void
Parameters
- $flags : int
releaseTableStartColumn()
    protected releaseTableStartColumn() : string
Return values
string
replaceNonAsciiIfNeeded()
    protected static replaceNonAsciiIfNeeded(string $convert) : string|null
Parameters
- $convert : string
Return values
string|null
setTableStartColumn()
    protected setTableStartColumn(string $column) : string
Parameters
- $column : string
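Taken together, the public methods listed in this reference suggest a simple workflow: create a reader, adjust its options, load HTML, and read the resulting worksheet. The sketch below is one possible arrangement, assuming PhpSpreadsheet is installed and Composer's autoloader has already been loaded; `htmlToRows()` is a hypothetical helper, and `Worksheet::toArray()` belongs to the Worksheet class rather than to this reader.

```php
<?php
// Assumes Composer's autoloader is loaded elsewhere,
// for example with: require 'vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\Reader\Html;

/**
 * Hypothetical helper (not part of the library): parse an HTML string
 * and return the cells of the resulting worksheet as a nested array.
 */
function htmlToRows(string $html): array
{
    $reader = new Html();

    // Keep external images disabled (the documented default) for untrusted input.
    $reader->setAllowExternalImages(false);

    $spreadsheet = $reader->loadFromString($html);

    // toArray() returns one array per row, each holding that row's cell values.
    return $spreadsheet->getActiveSheet()->toArray();
}

// Usage:
// $rows = htmlToRows('<table><tr><td>Name</td><td>Qty</td></tr></table>');
```

For files on disk, `load($filename)` could be swapped in for `loadFromString()`, optionally guarded by `canRead($filename)`.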