I have a project that uses preg_replace. I searched Google for how to remove all attributes from a tag using PHP, but the results were complicated, so I worked it out myself.

What are HTML attributes?

HTML attributes provide additional information about HTML elements. The HTML standard does not require lowercase attribute names: the title attribute (and every other attribute) can be written in uppercase or lowercase, such as title or TITLE. The standard also does not require quotes around attribute values, as long as the value contains no whitespace and none of the characters ", ', =, <, > or `. Double quotes are the most common, but single quotes also work. When the value itself contains double quotes, wrap it in single quotes instead, for example title='a "quoted" word'.

HTML Attributes

  • All HTML elements can have attributes
  • Attributes provide additional information about elements
  • HTML Attributes are always specified in the start tag
  • Attributes usually come in name/value pairs like: name="value"
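To make these rules concrete, here is a minimal sketch that feeds three spellings of the same attribute to PHP's DOMDocument and reads them back. It assumes the DOM extension is available; the sample markup and the $titles variable are made up for illustration.

```php
<?php
// Illustrative markup only: uppercase name without quotes, double quotes,
// and single quotes around a value that itself contains double quotes.
$html = '<p TITLE=Hello>one</p>'
      . '<p title="Hello">two</p>'
      . "<p title='a \"quoted\" word'>three</p>";

$doc = new DOMDocument();
$doc->loadHTML($html);

$titles = [];
foreach ($doc->getElementsByTagName('p') as $p) {
    // The HTML parser lowercases attribute names, so 'title' also finds TITLE.
    $titles[] = $p->getAttribute('title');
}

print_r($titles);
```

All three paragraphs should come back with a title value, which is why any attribute-stripping pattern has to cope with every one of these forms.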

Remove all attributes from all HTML tags

The first result showed me a way to remove all attributes from every HTML tag. It covers any tag, such as a, p, br, or table, so every tag is reduced to its bare form.

Here is the code:

preg_replace("/<([a-z][a-z0-9]*)[^>]*?(\/?)>/si",'<$1$2>', $text);

For example, if you have HTML code like this:

<p style="padding:10px;">
<strong style="padding:10px;margin:20px;"><a href="index.php" class="atag">hello</a></strong>
</p>
A shorter sample works the same way:

$text = '<p style="padding:10px;"><strong style="padding:10px;margin:20px;">Welcome to Bien Thuy Website</strong></p>';
echo preg_replace("/<([a-z][a-z0-9]*)[^>]*?(\/?)>/si",'<$1$2>', $text);
// Output: <p><strong>Welcome to Bien Thuy Website</strong></p>

Here is what the regular expression means:

/              # Start Pattern
 <             # Match '<' at beginning of tags
 (             # Start Capture Group $1 - Tag Name
  [a-z]        # Match 'a' through 'z'
  [a-z0-9]*    # Match 'a' through 'z' or '0' through '9' zero or more times
 )             # End Capture Group
 [^>]*?        # Match anything other than '>', zero or more times, non-greedy (won't eat the /)
 (\/?)         # Capture Group $2 - '/' if it is there
 >             # Match '>'
/si            # End Pattern - 'i': case-insensitive; 's': dot also matches newlines (no effect here, the pattern has no dot)
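The annotated breakdown above can be written as real PHP with the x (extended) modifier, which lets a pattern contain whitespace and # comments. This is only a sketch: the function name stripAllAttributes is my own, and I switched the delimiter to ~ so the slash needs no escaping.

```php
<?php
// Hypothetical helper name; the pattern is the one from the article,
// spread over several lines with the x modifier.
function stripAllAttributes(string $html): string
{
    $pattern = '~
        <                 # opening angle bracket of a start tag
        ([a-z][a-z0-9]*)  # group 1: the tag name
        [^>]*?            # the attributes, matched lazily
        (/?)              # group 2: the slash of a self-closing tag, if present
        >                 # closing angle bracket
    ~xi';

    // preg_replace returns null on error; fall back to the input in that case.
    return preg_replace($pattern, '<$1$2>', $html) ?? $html;
}

echo stripAllAttributes('<p style="padding:10px;"><br class="x" />Hi</p>');
```

Closing tags such as </p> are left alone because the character after < must be a letter.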

Applied to the first HTML sample above (the one with the a tag), the output looks like this:

<p>
<strong><a>hello</a></strong>
</p>

That's not what I expected. What I want is to remove the attributes of specific tags only, such as the p tag, the a tag, or the br tag.

Remove all attributes from a specific tag in PHP

After some more searching, I found a way that worked well for me in PHP.
Here is the code I used:

preg_replace("~<table\s+.*?>~i",'<table>', $content);

Apply this code to whichever tag you want to strip attributes from, replacing table with the tag name.
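If you do this for many tags, the one-liner can be wrapped in a small function. The sketch below is my own variation, not the exact code above: stripTagAttributes is a hypothetical name, and it uses [^>]* instead of .*? so that a tag whose attributes span several lines is still matched.

```php
<?php
// Hypothetical helper: strips the attributes of one tag name only.
function stripTagAttributes(string $html, string $tag): string
{
    // Escape the tag name in case it contains regex metacharacters.
    $name = preg_quote($tag, '~');

    // \s+ right after the name keeps "th" from matching "thead".
    // [^>]* stays inside a single tag, even across line breaks.
    $result = preg_replace('~<(' . $name . ')\s+[^>]*>~i', '<$1>', $html);

    // preg_replace returns null on error; fall back to the input in that case.
    return $result ?? $html;
}

echo stripTagAttributes('<th class="a">x</th><thead class="b">', 'th');
```

Because the replacement reuses the captured name, a tag written as TH keeps its original case.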
For example, I have a table like this:

<table class="table auto Table--table ">
<thead class="Table--thead">
<tr class="Table--tr">
<th class="align-left Table--th" colspan="" rowspan="" width="">Team</th>
<th class="align-right Table--th" colspan="" rowspan="" width="">2018 income (million)</th>
<th class="align-right Table--th" colspan="" rowspan="" width="">2020 income (million)</th>
<th class="align-right Table--th" colspan="" rowspan="" width="">% Change</th>
</tr>
</thead>
<tbody class="Table--tbody">
<tr class="Table--tr Table--tr-visible">
<td class="align-left Table--td" colspan="" rowspan="" width="">Man UTD</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$363</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$330</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">-9.07%</td>
</tr>
<tr class="Table--tr Table--tr-visible">
<td class="align-left Table--td" colspan="" rowspan="" width="">Man City</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$404</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$361</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">-10.55%</td>
</tr>
<tr class="Table--tr Table--tr-visible">
<td class="align-left Table--td" colspan="" rowspan="" width="">Arsenal</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$397</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$398</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">0.17%</td>
</tr>
<tr class="Table--tr Table--tr-visible">
<td class="align-left Table--td" colspan="" rowspan="" width="">Chelsea</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$500</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$486</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">-2.79%</td>
</tr>
<tr class="Table--tr Table--tr-visible">
<td class="align-left Table--td" colspan="" rowspan="" width="">Livepool</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$566</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$547</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">-3.42%</td>
</tr>
</tbody>
</table>

Then I apply the code above, once for each tag in the table:

$content = '<table class="table auto Table--table ">
<thead class="Table--thead">
<tr class="Table--tr">
<th class="align-left Table--th" colspan="" rowspan="" width="">Team</th>
<th class="align-right Table--th" colspan="" rowspan="" width="">2018 income (million)</th>
<th class="align-right Table--th" colspan="" rowspan="" width="">2020 income (million)</th>
<th class="align-right Table--th" colspan="" rowspan="" width="">% Change</th>
</tr>
</thead>
<tbody class="Table--tbody">
<tr class="Table--tr Table--tr-visible">
<td class="align-left Table--td" colspan="" rowspan="" width="">Man UTD</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$363</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$330</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">-9.07%</td>
</tr>
<tr class="Table--tr Table--tr-visible">
<td class="align-left Table--td" colspan="" rowspan="" width="">Man City</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$404</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$361</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">-10.55%</td>
</tr>
<tr class="Table--tr Table--tr-visible">
<td class="align-left Table--td" colspan="" rowspan="" width="">Arsenal</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$397</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$398</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">0.17%</td>
</tr>
<tr class="Table--tr Table--tr-visible">
<td class="align-left Table--td" colspan="" rowspan="" width="">Chelsea</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$500</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$486</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">-2.79%</td>
</tr>
<tr class="Table--tr Table--tr-visible">
<td class="align-left Table--td" colspan="" rowspan="" width="">Livepool</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$566</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">$547</td>
<td class="align-right Table--td" colspan="" rowspan="" width="">-3.42%</td>
</tr>
</tbody>
</table>';
$content = preg_replace("~<table\s+.*?>~i",'<table>', $content);
$content = preg_replace("~<thead\s+.*?>~i",'<thead>', $content);
$content = preg_replace("~<tbody\s+.*?>~i",'<tbody>', $content);
$content = preg_replace("~<tr\s+.*?>~i",'<tr>', $content);
$content = preg_replace("~<th\s+.*?>~i",'<th>', $content);
$content = preg_replace("~<td\s+.*?>~i",'<td>', $content);
echo $content;
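The six calls above can also be folded into one pattern with an alternation of tag names. This is a sketch under my own assumptions: the function name stripAttributesFromTags and its $tags parameter are hypothetical, and it uses [^>]* rather than .*? so multi-line tags are covered too.

```php
<?php
// Hypothetical helper: strips attributes from every tag named in $tags.
function stripAttributesFromTags(string $html, array $tags): string
{
    if ($tags === []) {
        return $html; // nothing to do, and an empty alternation would be wrong
    }

    // Build "table|thead|tbody|..." with each name escaped for the regex.
    $names = implode('|', array_map(
        fn (string $tag): string => preg_quote($tag, '~'),
        $tags
    ));

    // \s+ after the name forces a full-name match, so "th" cannot eat "thead".
    $result = preg_replace('~<(' . $names . ')\s+[^>]*>~i', '<$1>', $html);

    // preg_replace returns null on error; fall back to the input in that case.
    return $result ?? $html;
}

$sample = '<table class="t"><tr class="r"><th class="h" colspan="">Team</th></tr></table>';
echo stripAttributesFromTags($sample, ['table', 'thead', 'tbody', 'tr', 'th', 'td']);
```

One pass over the string replaces all listed tags, and adding another tag only means adding its name to the array.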

Here is the output after running preg_replace:

<table>
<thead>
<tr>
<th>Team</th>
<th>2018 income (million)</th>
<th>2020 income (million)</th>
<th>% Change</th>
</tr>
</thead>
<tbody>
<tr>
<td>Man UTD</td>
<td>$363</td>
<td>$330</td>
<td>-9.07%</td>
</tr>
<tr>
<td>Man City</td>
<td>$404</td>
<td>$361</td>
<td>-10.55%</td>
</tr>
<tr>
<td>Arsenal</td>
<td>$397</td>
<td>$398</td>
<td>0.17%</td>
</tr>
<tr>
<td>Chelsea</td>
<td>$500</td>
<td>$486</td>
<td>-2.79%</td>
</tr>
<tr>
<td>Livepool</td>
<td>$566</td>
<td>$547</td>
<td>-3.42%</td>
</tr>
</tbody>
</table>

Very clean, right?
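One caveat: these patterns stop at the first > they see, so an attribute value that contains > (for example title="a > b") would leave debris behind. If your HTML can contain such values, a DOM parser is a safer tool than preg_replace. The sketch below swaps the regex for PHP's DOMDocument; it assumes the DOM extension is available, the function name stripAttributesWithDom is hypothetical, and it is a starting point rather than a drop-in replacement.

```php
<?php
// Hypothetical helper: same goal as the regex version, but using a parser.
// Note: loadHTML may mangle non-ASCII text unless an encoding is declared.
function stripAttributesWithDom(string $html, array $tags): string
{
    $doc = new DOMDocument();

    // Silence parser warnings about imperfect markup, then restore the setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<html><body>' . $html . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($tags as $tag) {
        foreach ($doc->getElementsByTagName($tag) as $element) {
            // Always remove the first attribute until none are left.
            while ($element->attributes->length > 0) {
                $element->removeAttribute($element->attributes->item(0)->nodeName);
            }
        }
    }

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $html; // parser produced no body; leave the input untouched
    }

    // Serialize only the children of <body>, i.e. the original fragment.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo stripAttributesWithDom('<p class="a" title="x > y">hi</p>', ['p']);
```

The parser may normalize whitespace or tag case in its output, so compare results by content rather than byte for byte.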
