I am reading data from a text file and need to find the line whose id matches a given substring. Once I have that line, I need to get the third value in it, which is an 8-digit number that comes after a "|" delimiter. All the values have varying lengths and are separated by the delimiter "|", except the first value (the id), which has a fixed length and fixed start and end positions.
Text file data example:
0123456|BHKAHHHHkk|12345678|JuiKKK121255
9100450|HHkk|12348888|JuiKKK10000000021sdadad255
$file = 'file.txt';
// the following line prevents the browser from parsing this as HTML.
header('Content-Type: text/plain');
// get the file contents, assuming the file to be readable (and exist)
$contents = file_get_contents($file);
// split the contents into lines
$txt = explode("\n",$contents);
$counter = 0;
foreach($txt as $key => $line){
$subbedString = substr($line,2,6);
// $searchfor = '123456';
//echo strpos($subbedString,$searchfor);
if(strpos($subbedString,$searchfor) === 0){
$matches[$key] = $searchfor;
$matchesLine[$key] = substr($line,2,50);
echo "<p>" . $matchesLine[$key] . "</p>";
$counter++;
if($counter==10) break;
}
}
CodePudding user response:
- To split a file's contents on line breaks, it is simpler to use the file() function, which returns the lines as an array.
- To split a line into parts of unknown length on a delimiter, use the explode() function.
Code:
$file = 'file.txt';
// file() reads the file straight into an array of lines
$txt = file($file);
$searchFor = '123456';
$counter = 0;
foreach ($txt as $key => $line) {
    $line = \trim($line);
    // skip empty lines
    if ($line === '') {
        continue;
    }
    $substrings = explode('|', $line);
    // compare the id without its first character
    if (substr($substrings[0], 1) === $searchFor) {
        // skip lines that have no third value
        if (!isset($substrings[2])) {
            continue;
        }
        $matches[$key] = $searchFor;
        $matchesLine[$key] = $line;
        echo "<p>" . $substrings[2] . "</p>";
        // stop after 10 matches
        $counter++;
        if ($counter === 10) {
            break;
        }
    }
}
I also noticed that the ids in your example have 7 digits, while you were talking about 6 digits (and the $searchfor variable didn't match anything).
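To make the id-length point concrete, here is a minimal sketch that wraps the same explode() approach in a function. It assumes the id has 7 characters and that the search term is the 6 digits after its first character. The name findThirdField is hypothetical and not part of the original code.

```php
<?php
// Hypothetical helper: returns the third '|'-separated value of the first
// line whose id (first value) matches $searchFor from offset 1 onward.
function findThirdField(array $lines, string $searchFor): ?string
{
    foreach ($lines as $line) {
        $fields = explode('|', trim($line));
        // Skip lines that do not have at least three values.
        if (count($fields) < 3) {
            continue;
        }
        // Assumption: 7-character id, search term is its last 6 characters.
        if (substr($fields[0], 1, 6) === $searchFor) {
            return $fields[2];
        }
    }
    return null; // no matching line
}

$lines = [
    '0123456|BHKAHHHHkk|12345678|JuiKKK121255',
    '9100450|HHkk|12348888|JuiKKK10000000021sdadad255',
];

var_dump(findThirdField($lines, '123456')); // string(8) "12345678"
var_dump(findThirdField($lines, '100450')); // string(8) "12348888"
```

With real data, the array passed in could come from file('file.txt', FILE_IGNORE_NEW_LINES).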
CodePudding user response:
Use
^(\d+)\|[^|]*\|(\d{8})\|
See regex proof.
EXPLANATION
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
\| '|'
--------------------------------------------------------------------------------
[^|]* any character except: '|' (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
\| '|'
--------------------------------------------------------------------------------
( group and capture to \2:
--------------------------------------------------------------------------------
\d{8} digits (0-9) (8 times)
--------------------------------------------------------------------------------
) end of \2
--------------------------------------------------------------------------------
\| '|'
<?php
$re = '/^(\d+)\|[^|]*\|(\d{8})\|/m';
$str = '0123456|BHKAHHHHkk|12345678|JuiKKK121255
9100450|HHkk|12348888|JuiKKK10000000021sdadad255';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
// Print the entire match result
var_dump($matches);
Sample output:
array(2) {
[0]=>
array(3) {
[0]=>
string(28) "0123456|BHKAHHHHkk|12345678|"
[1]=>
string(7) "0123456"
[2]=>
string(8) "12345678"
}
[1]=>
array(3) {
[0]=>
string(22) "9100450|HHkk|12348888|"
[1]=>
string(7) "9100450"
[2]=>
string(8) "12348888"
}
}
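If only one id is of interest, the same pattern can be narrowed to that id. The following is a minimal sketch, assuming the full id is known (7 digits in the sample data). The name thirdFieldForId is hypothetical. preg_quote() is used so that any special characters in the id are treated literally.

```php
<?php
// Hypothetical helper: extracts the 8-digit third value for one exact id.
function thirdFieldForId(string $data, string $id): ?string
{
    // preg_quote() escapes regex metacharacters in the id;
    // the /m flag lets ^ match at the start of every line.
    $re = '/^' . preg_quote($id, '/') . '\|[^|]*\|(\d{8})\|/m';
    if (preg_match($re, $data, $m) === 1) {
        return $m[1];
    }
    return null; // id not found, or the third value is not 8 digits
}

$data = "0123456|BHKAHHHHkk|12345678|JuiKKK121255\n"
      . "9100450|HHkk|12348888|JuiKKK10000000021sdadad255";

var_dump(thirdFieldForId($data, '9100450')); // string(8) "12348888"
var_dump(thirdFieldForId($data, '0000000')); // NULL
```

Unlike preg_match_all, this stops at the first matching line, which is enough when ids are unique.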