Hi, I am fairly new to Linux scripting and need help writing a command that searches the contents of one or more HTML files and uses selected content from each file to rename the folder associated with it. Basically, I have downloaded content into folders, and I need to pull tagging data from the web page and append those tags to the end of each folder's name to make the folders keyword searchable. The goal is for each folder name to read like “John Doe [person1] [person2]”, etc.
I have already used wget to acquire all the HTML files, which are named like “index 01.html”. Each HTML file contains the string
<meta itemprop="name" content="John Doe" />
This name is exactly the same as the current name of the folder, so it can be used to locate the folder that needs to be renamed. Further down in the HTML is another string,
<meta name="twitter:description" content=" person1, person2, ect" />
which contains the tags I wish to add to the folder name.
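To make the extraction step concrete, here is a rough sketch rather than a tested solution. It assumes grep supports the -o option (GNU and BSD grep do) and that each meta tag sits on one line with its attributes in exactly the order and spacing shown above. The function name meta_content is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the content="..." value of the first <meta> tag
# that matches the given attribute text.
#   $1 = attribute text, e.g. itemprop="name"
#   $2 = path to the HTML file
meta_content() {
    # grep -o prints only the matched part of the line; head keeps the first
    # hit; sed strips everything up to content=" and the closing quote.
    grep -o "<meta $1 content=\"[^\"]*\"" "$2" | head -n 1 | sed 's/.*content="//; s/"$//'
}
```

With that in place, meta_content 'itemprop="name"' "index 01.html" should print the folder name, and meta_content 'name="twitter:description"' "index 01.html" should print the raw, comma-separated tag list. A real HTML parser would cope better with pages that format these tags differently.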
I have looked all over for how to do this, and I believe it involves grep, but all of the examples I have seen just use grep to perform search-and-replace operations, not to search for a string and then pipe the matched contents to another application.
Lastly, if the command could repeat the operation for every .html file within the folder, that would save me a TON of time.
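For the whole job, something along these lines might work. It is only a sketch under several assumptions: bash is the shell, the .html files sit in the same parent directory as the folders to be renamed, each meta tag is on one line in the form described above, and folder names contain no "/" characters. The function names (meta_content, format_tags, rename_folders) are invented for this example.

```shell
#!/bin/bash
# Print the content="..." value of the first matching <meta> tag.
#   $1 = attribute text such as itemprop="name", $2 = HTML file
meta_content() {
    grep -o "<meta $1 content=\"[^\"]*\"" "$2" | head -n 1 | sed 's/.*content="//; s/"$//'
}

# Turn a comma-separated list like " person1, person2" into "[person1] [person2]".
format_tags() {
    printf '%s\n' "$1" | tr ',' '\n' \
        | sed 's/^ *//; s/ *$//; /^$/d; s/.*/[&]/' \
        | paste -sd ' ' -
}

# For every .html file in directory $1 (default: current directory), rename
# the sibling folder named after the page by appending the bracketed tags.
rename_folders() {
    local dir=${1:-.} file name tags
    for file in "$dir"/*.html; do
        [ -e "$file" ] || continue          # the glob matched nothing
        name=$(meta_content 'itemprop="name"' "$file")
        tags=$(format_tags "$(meta_content 'name="twitter:description"' "$file")")
        # Skip pages with missing data or without a matching folder.
        if [ -z "$name" ] || [ -z "$tags" ] || [ ! -d "$dir/$name" ]; then
            continue
        fi
        # Never overwrite or merge into something that already exists.
        [ -e "$dir/$name $tags" ] && continue
        mv -- "$dir/$name" "$dir/$name $tags"
    done
}
```

Saving this to a file, sourcing it, and running rename_folders /path/to/downloads would process every page in that directory. Replacing mv with echo mv for a first pass would give a dry run that shows the planned renames without touching anything.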