Based on my recent testing, it seems like it is possible to attach PDF files to notes in Joplin, but the content of those PDF files doesn't get searched when the user enters a search string. I think it would be handy for Joplin to search both the note text and the text in any attached PDF files. It would make Joplin more useful for managing academic papers and that sort of thing. Thanks for all the work you've done so far on this great app!
I came here to suggest the same thing. This is a fairly big issue for me, given that a large number of the existing research notes I am importing from Evernote are PDFs. This feature could also be linked to the ability to natively view/preview PDFs within Joplin, as per issue #877.
I currently have lots of scanned and OCR-ed documents as PDFs in Evernote and rely on being able to search the text of them. So this feature is more important to me than preview of PDFs.
I totally agree with previous speakers.
For me, this is an essential piece for switching from Evernote, as I scan everything I receive in the mail.
The actual OCR can always be solved outside of Joplin, but searching already OCR-ed or otherwise searchable files is really needed.
I definitely see a potential complication with this though when searching on a mobile device where you might not sync all attachments.
Can there be some kind of search index, created by the desktop or terminal application, saved alongside each note perhaps?
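To make the idea concrete, here is a minimal sketch of such a per-note index. It is only an illustration, not how Joplin stores or searches anything: `tokenize`, `build_note_index` and `note_matches` are hypothetical names, and the attachment text is assumed to have been extracted already by the desktop or terminal application. The point is that a device only needs the small serialized index, not the attachments themselves.

```python
import json
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_note_index(attachment_texts):
    """Built on the desktop/terminal app: one compact token list per note.

    `attachment_texts` is the already-extracted text of each attachment
    (hypothetical input; extraction itself is out of scope here).
    """
    tokens = set()
    for text in attachment_texts:
        tokens |= tokenize(text)
    # JSON string that could be synced alongside the note.
    return json.dumps(sorted(tokens))


def note_matches(serialized_index, query):
    """Usable on any device: needs only the synced index, not the PDFs."""
    wanted = tokenize(query)
    return bool(wanted) and wanted <= set(json.loads(serialized_index))


# Example: the index is built once, then searched without the attachment.
index = build_note_index(["Invoice from the electricity company, March"])
print(note_matches(index, "electricity invoice"))
print(note_matches(index, "water invoice"))
```

A token list like this only supports whole-word matching; phrase or prefix search would need a richer index format.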
Hey there, it looks like there has been no activity on this issue recently. Has the issue been fixed, or does it still require the community's attention? This issue may be closed if no further activity occurs. You may also label this issue as "backlog" and I will leave it open. Thank you for your contributions.
Just a small comment to prevent the bot from closing this request.
Hey there, it looks like there has been no activity on this issue recently. Has the issue been fixed, or does it still require the community's attention? This issue may be closed if no further activity occurs. You may comment on the issue and I will leave it open. Thank you for your contributions.
Don't close.
I can only agree with the previous speakers. I scan almost all documents with a Fujitsu Scansnap ix500 and it would be absolutely awesome if you could search these PDFs in Joplin.
Hello, I'm just discovering Joplin and it really looks like the Evernote replacement I am looking for. But I use EN extensively to store PDF files, and searching the PDF content is THE feature that I can find only in EN at the moment. Having it in Joplin would definitely convince me to make the switch. Thanks!
Searching for the same thing and have the same use case. Lots of docs scanned to Evernote with OCR and need to be able to search them. This is preventing me from switching at present but I will keep an eye on progress. Otherwise looks great.
I also think searching within PDFs is very important. Would an integration with something like rga be possible? It is a command-line utility based on ripgrep that also searches within files.
Hey there, it looks like there has been no activity on this issue recently. Has the issue been fixed, or does it still require the community's attention? This issue may be closed if no further activity occurs. You may comment on the issue and I will leave it open. Thank you for your contributions.
It would be great to add this feature. Any thoughts about integrating something like rga?
This would be an excellent feature. I primarily use Evernote as a scanning and searching repository so to make the full switch from EN to Joplin would require this feature.
@laurent22 Any thoughts or ideas on how to move forward with this?
Maybe there are some obvious (but less obvious to me) reasons not to use something like pdf.js (https://github.com/mozilla/pdf.js), which already seems well established for reading and extracting text from PDF files.
Something like this combined with my previous suggestion of creating a plain text file that is saved alongside each note perhaps might work?
Granted, I have no idea how Joplin actually handles search internally, but if the contents of the PDF file are available as a plain text file, it at least seems reasonable for them to be picked up by Joplin's search, while at the same time offloading any reading/extracting of PDF files to some time other than when the user actually performs the search.
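A rough sketch of that sidecar approach, under stated assumptions: `extract_text` stands in for whatever extractor is used (pdf.js, rga, or anything else) and is passed in as a plain function, the `.txt` naming scheme is made up for illustration, and none of this reflects Joplin's real resource storage or search code. The demo uses fake "PDFs" that contain plain text so that it runs without any PDF library.

```python
import tempfile
from pathlib import Path


def sidecar_path(pdf_path):
    """Hypothetical naming scheme: paper.pdf -> paper.pdf.txt."""
    return pdf_path.with_name(pdf_path.name + ".txt")


def refresh_sidecar(pdf_path, extract_text):
    """Run at attach/sync time, not at search time.

    Returns True if the text was (re)extracted, False if the existing
    sidecar is at least as new as the PDF.
    """
    sidecar = sidecar_path(pdf_path)
    if sidecar.exists() and sidecar.stat().st_mtime >= pdf_path.stat().st_mtime:
        return False
    sidecar.write_text(extract_text(pdf_path), encoding="utf-8")
    return True


def search_sidecars(folder, query):
    """Search reads only the plain text files and never opens a PDF."""
    needle = query.lower()
    hits = []
    for sidecar in Path(folder).glob("*.pdf.txt"):
        if needle in sidecar.read_text(encoding="utf-8").lower():
            hits.append(sidecar.name[: -len(".txt")])
    return sorted(hits)


def fake_extract(pdf_path):
    """Placeholder extractor for the demo: the fake PDFs hold plain text."""
    return pdf_path.read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "scan1.pdf").write_text("Tax statement 2019", encoding="utf-8")
    (folder / "scan2.pdf").write_text("Rental agreement", encoding="utf-8")
    for pdf in folder.glob("*.pdf"):
        refresh_sidecar(pdf, fake_extract)
    print(search_sidecars(folder, "tax"))
```

Comparing modification times keeps repeated runs cheap: a PDF is only re-read when it is newer than its text file.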
This is also an essential feature for me.
I'd love to see this as well.