Apache Spark: reading a file from the Hadoop file system

Photo by Zoë on Unsplash

The default path for the Hadoop file system is configured in core-site.xml, through the fs.defaultFS property.
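As a rough illustration of what that configuration looks like and how the default path can be read back, here is a small Python sketch. The XML content is a made-up sample: the host and port (localhost:9000) are placeholders, not values from this post, and the helper name default_fs is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical core-site.xml content; host and port are placeholders.
CORE_SITE_XML = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
"""

def default_fs(xml_text):
    """Return the value of fs.defaultFS, or None if the property is not set."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == "fs.defaultFS":
            return prop.findtext("value")
    return None

print(default_fs(CORE_SITE_XML))
```

On a real cluster the file usually lives in the Hadoop configuration directory, and the value would point at your own NameNode.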

To read the file from Spark, we need a SparkContext.
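One possible way to obtain a SparkContext in PySpark is sketched below. The function name create_context, the application name and the local[*] master are assumptions made for illustration; in spark-shell or pyspark a context named sc is normally already available.

```python
def create_context(app_name="read-hdfs-file", master="local[*]"):
    """Build (or reuse) a SparkContext. app_name and master are illustrative defaults."""
    # Imported here so the module can be loaded even where pyspark is not installed.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName(app_name).setMaster(master)
    # getOrCreate returns the already running context if there is one.
    return SparkContext.getOrCreate(conf)
```

Calling sc = create_context() then gives the context used in the following steps.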

Then we can get a reference to the text file by passing its Hadoop path to SparkContext.textFile.
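A minimal sketch of this step in PySpark follows. The helper names hdfs_uri and load_text_file are hypothetical, and any path you pass in is your own; only sc.textFile is the actual Spark call.

```python
def hdfs_uri(default_fs, path):
    """Join an fs.defaultFS value and an absolute HDFS path into one URI."""
    return default_fs.rstrip("/") + "/" + path.lstrip("/")

def load_text_file(sc, uri):
    """Return an RDD of lines for the file at uri. Nothing is read until an action runs."""
    return sc.textFile(uri)
```

For example, hdfs_uri("hdfs://localhost:9000", "/data/README.md") yields "hdfs://localhost:9000/data/README.md" (a placeholder location). A path without a scheme, such as "/data/README.md", is resolved against the default file system that Spark picks up from the Hadoop configuration.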

For example, we can get the first line of the text file.
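In PySpark this could look roughly like the sketch below, where text_file is the RDD returned by textFile. The helper name first_line is hypothetical. Calling text_file.first() directly also works, but it raises an error on an empty file, so this version uses take(1) instead.

```python
def first_line(text_file):
    """Return the first line of the RDD, or None if the file is empty."""
    # take(1) is an action: this is the point where Spark actually reads from HDFS.
    lines = text_file.take(1)
    return lines[0] if lines else None
```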

Happy coding ~~
