How does Spark partition data?

Question

Anonymous user · Accepted Answer

I can attest that this is quite true. The company develops great products, but it's disappointing how they treat candidates whom they themselves contacted. It says a lot about the organisation's culture and values.

Anonymous user · Answer

There are two parts to the answer. First, when reading from a source, Spark partitions data according to the Hadoop input format, typically one partition per input split. Subsequently, for example after a shuffle, it partitions data according to the level of parallelism, so that each partition can be processed by a single task. The level of parallelism has a default value, and some functions let you override it on a per-call basis.
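To make the two stages concrete, here is a rough toy model in plain Python. It is not Spark code and not how Spark is implemented; the names `input_splits`, `hash_partition`, `reduce_by_key` and `DEFAULT_PARALLELISM` are made up for illustration. It only assumes the general idea that reads produce one partition per split, and that a shuffle assigns each key to a partition by hashing the key modulo the number of partitions.

```python
import zlib

# Hypothetical default, standing in for a cluster-wide parallelism setting.
DEFAULT_PARALLELISM = 4


def input_splits(total_bytes, split_bytes):
    """Toy read-time partitioning: one (start, end) byte range per split."""
    return [
        (start, min(start + split_bytes, total_bytes))
        for start in range(0, total_bytes, split_bytes)
    ]


def hash_partition(records, num_partitions):
    """Toy shuffle partitioning: key -> hash(key) mod num_partitions."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        # crc32 is stable across runs; Python's built-in str hash is salted.
        index = zlib.crc32(key.encode("utf-8")) % num_partitions
        partitions[index].append((key, value))
    return partitions


def reduce_by_key(records, num_partitions=None):
    """Sum values per key; the caller may override the partition count."""
    n = DEFAULT_PARALLELISM if num_partitions is None else num_partitions
    results = []
    for partition in hash_partition(records, n):
        # Each partition is handled independently, like one task.
        totals = {}
        for key, value in partition:
            totals[key] = totals.get(key, 0) + value
        results.append(totals)
    return results


records = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
splits = input_splits(300, 128)              # 3 splits for a 300-byte input
by_default = reduce_by_key(records)          # uses DEFAULT_PARALLELISM
overridden = reduce_by_key(records, num_partitions=2)
```

In Spark itself, the default corresponds to settings such as `spark.default.parallelism`, and the per-call override corresponds to the optional `numPartitions` argument on operations like `reduceByKey`, or to an explicit `repartition` call.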

Databricks

Databricks interview question
