This article is about searching for text in .HTM and .HTML files. It answers three questions: what an HTML file is, why you would use this file format, and what the best way to search it for text is.
What is an HTML file?
HTML stands for HyperText Markup Language. It is the lingua franca for publishing hypertext on the World Wide Web. It is a non-proprietary format based on SGML, and can be created and processed by a wide range of tools, from simple plain text editors, where you type it in from scratch, to sophisticated WYSIWYG authoring tools.
HTML 4.0 was first released as a W3C Recommendation on 18 December 1997. This specification has since been superseded by HTML 4.01. More information about HTML is available at www.w3.org.
Why do I need to search in HTML files?
Because opening each file one by one takes too much time. When you save web pages or download whole websites, you need some means of searching within the files you have collected.
Is there a native Windows tool to search in HTML files?
Unfortunately, no. “Find file” in Windows cannot search for text within .htm files, because Windows has no means of converting an .htm or .html file into a plain text file.
If you want full access to your whole knowledge base, you should use a dedicated search utility with HTM search support (such as FSA, which is described below).
What is the simplest way to search in HTML files? An HTML search engine.
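To see why the conversion to plain text matters, here is a minimal Python sketch. It is not how FSA works internally; the names TextExtractor and html_to_text are made up for this illustration, and only the standard html.parser module is used. It shows a phrase that a plain text search of the raw file misses because a tag splits it.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text of an HTML document, skipping <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Return the text content of `html` with runs of whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Pieces are joined with a space, so a tag in the middle of a word
    # would split that word; good enough for a sketch.
    return " ".join(" ".join(parser.parts).split())


page = "<p>File <b>Search</b> Assistant &amp; friends</p>"
print("Search Assistant" in page)                # False: a tag splits the phrase
print("Search Assistant" in html_to_text(page))  # True
```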
We created File Search Assistant (FSA) to make your search fast and effective.
Run File Search Assistant, enter your search criteria and click the “Search” button:
1. Put the *.htm mask in the “File Name” field. This tells FSA to search only .htm files.
2. Use the “Search in” drop-down list to specify the location to search.
3. Put the keywords you want to find in the “Keyword” field.
- Use custom search options to create custom search groups; this will make your searches faster. Read more in the online manual.
- With custom search options you can skip the step where you specify the search mask and the path to search.
- You can use a search expression as a keyword, for example: +dog -cat +"my cat"
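The three steps above (file mask, location, keywords) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not FSA's implementation: the function names are invented, a leading + is read as "must contain" and a leading - as "must not contain", and the raw file content is searched without any HTML-to-text conversion.

```python
import fnmatch
import os
import shlex


def parse_expression(expression):
    """Split an expression such as '+dog -cat' into (required, excluded) terms."""
    # Treat typographic quotes like plain ones so quoted phrases stay together.
    expression = expression.replace("\u201c", '"').replace("\u201d", '"')
    required, excluded = [], []
    for token in shlex.split(expression):
        target = excluded if token.startswith("-") else required
        term = token.lstrip("+-").lower()
        if term:
            target.append(term)
    return required, excluded


def matches(text, required, excluded):
    """Assumed semantics: every required term present, no excluded term present."""
    text = text.lower()
    return all(term in text for term in required) and not any(
        term in text for term in excluded
    )


def search(root, mask, expression):
    """Yield paths under `root` whose name fits `mask` and whose content matches."""
    required, excluded = parse_expression(expression)
    for folder, _dirs, files in os.walk(root):
        for name in fnmatch.filter(files, mask):
            path = os.path.join(folder, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                if matches(handle.read(), required, excluded):
                    yield path
```

With a folder of saved pages, search(folder, "*.htm", "+dog -cat") would yield every .htm file that mentions "dog" but not "cat". Note that under this simple reading an expression that excludes cat and also requires "my cat" can never match, so the real operators may well behave differently.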
OK. Now tell me how FSA can be more effective than other HTML search tools.
Three key features make FSA extremely useful:
Custom Search Options: you can point FSA to the folders where your files are and specify exactly where you want to search. Once you have created a search group, you can reuse it for every later search.
Preview Pane: the preview pane shows you the small piece of the file where the keyword (or keywords, if you search with an expression) was found. Specify your search criteria and click the “Search” button to see how it works.
FSA highlights the found keyword(s) so you can decide whether this is the file you need. If not, click the “Next File” button. If you are not sure, click the “Next Fragment” button to show the next text fragment containing the found keyword(s).
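The idea of stepping through highlighted fragments can be pictured with a short Python sketch. The function name fragments, the context parameter and the [[ ]] markers that stand in for highlighting are all assumptions made for this example; FSA's own preview may be built quite differently.

```python
import re


def fragments(text, keyword, context=30):
    """Yield short excerpts of `text` around each case-insensitive hit of `keyword`.

    The hit itself is wrapped in [[ ]] as a stand-in for highlighting.
    """
    for match in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        start = max(match.start() - context, 0)
        end = min(match.end() + context, len(text))
        yield (
            text[start:match.start()]
            + "[[" + match.group(0) + "]]"
            + text[match.end():end]
        )


text = "The dog sat by the door. Later the Dog ran away."
preview = fragments(text, "dog", context=8)
print(next(preview))  # first fragment
print(next(preview))  # what a "Next Fragment" click would move on to
```

Because fragments is a generator, each call to next produces one more excerpt only when it is asked for, which mirrors clicking through fragments one at a time.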
HTML searches: generating a search report
At any time you can generate a search report from the search results and make use of it later.
FSA puts the following in the search report:
- text from the found files with highlighted keywords;
- information about the found files: size, path and relevancy to the search phrase;
- a link to each found file, so you can open it right from the search report.
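A report with these three ingredients could look roughly like the Python sketch below. Everything in it is hypothetical: the function names, the HTML layout, and above all the use of a simple hit count where FSA reports relevancy, since the article does not say how relevancy is calculated.

```python
import html
import re
from pathlib import Path


def highlight(excerpt, keyword):
    """HTML-escape `excerpt` and wrap each hit of `keyword` in <mark>."""
    pieces = re.split("(" + re.escape(keyword) + ")", excerpt, flags=re.IGNORECASE)
    # With one capture group, odd-numbered pieces are the hits themselves.
    return "".join(
        "<mark>" + html.escape(piece) + "</mark>" if index % 2 else html.escape(piece)
        for index, piece in enumerate(pieces)
    )


def build_report(keyword, results):
    """Build an HTML list from (path, size_in_bytes, excerpt) tuples."""
    rows = []
    for path, size, excerpt in results:
        link = Path(path).resolve().as_uri()  # file:// link that opens the file
        # Hit count is only a placeholder for a real relevancy score.
        hits = len(re.findall(re.escape(keyword), excerpt, flags=re.IGNORECASE))
        rows.append(
            f'<li><a href="{html.escape(link)}">{html.escape(str(path))}</a> '
            f"({size} bytes, {hits} hit(s))<br>{highlight(excerpt, keyword)}</li>"
        )
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


print(build_report("dog", [("pages/rex.htm", 2048, "My dog & your Dog")]))
```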
Tell me more about searching in HTML files
That depends on the computer you are using now. You can download FSA and try it for free.
Will it find ANY text in an HTML file?
Yes. FSA will also search in image alt text.
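Alt text lives in an attribute of the img tag rather than in the visible page text, so a search tool has to look for it on purpose. A small Python sketch of the idea, with the invented names AltTextCollector and alt_texts and only the standard html.parser module, might look like this:

```python
from html.parser import HTMLParser


class AltTextCollector(HTMLParser):
    """Collects the alt attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.found.append(alt)


def alt_texts(html):
    """Return the non-empty alt texts found in `html`, in document order."""
    collector = AltTextCollector()
    collector.feed(html)
    collector.close()
    return collector.found


page = '<p>Our mascot: <img src="rex.png" alt="A sleeping dog"></p>'
print(alt_texts(page))
```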
Can I use FSA to search an .htm file as a binary file?
You can turn off the “HTML to txt” filter at any time. FSA will then read HTML as a binary file.
If my HTML file is in a ZIP archive, can FSA find text in it?
Yes, just check the “search in zip files” checkbox.
Can it find an HTML file that is on a network disk?
Yes, FSA can search files on both local and network disks.
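What reading a file as binary means can be shown in a couple of lines of Python. The function name raw_contains is made up for this sketch, and it says nothing about how FSA does it: the file's bytes are searched exactly as stored, markup included.

```python
from pathlib import Path


def raw_contains(path, needle):
    """Report whether the file's bytes contain `needle`, with no HTML decoding."""
    return needle in Path(path).read_bytes()
```

For a page containing <b>Search</b> Assistant, raw_contains(path, b"<b>") is true because the markup itself is searched, while raw_contains(path, b"Search Assistant") is false because the closing tag sits between the two words. The needle also has to be given in the same encoding as the file.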
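Searching inside an archive means reading each packed file without unpacking it to disk first. A rough Python sketch using the standard zipfile module is shown below; the function name search_zip is invented, UTF-8 is assumed as the encoding, and the raw HTML is searched without conversion to text.

```python
import zipfile


def search_zip(archive_path, keyword):
    """Yield names of .htm/.html archive members whose content contains `keyword`."""
    needle = keyword.lower()
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith((".htm", ".html")):
                continue
            # Assumption: members are UTF-8; undecodable bytes are replaced.
            text = archive.read(name).decode("utf-8", errors="replace")
            if needle in text.lower():
                yield name
```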
Get File Search Assistant
You can get File Search Assistant right now. It’s a file search tool that lets you search popular file types on a local hard disk and across a network.