Q. In an interview, developers were given some features and asked to mark whether each feature was present in Hadoop 2:
1) In Hadoop 2.0, the default Linux ports are no longer short-lived, hence they don't fail at startup.
2) The Erasure Coding technique is used to handle fault tolerance.
3) Data balancing is done using the HDFS balancer.
4) The storage scheme used is the 3x storage scheme.
Mark the correct answer.
Q. A fresher was asked to map the different Hadoop data formats to their specifications.
The descriptions given were as follows:
1) This format allows writing data to files whose names are derived from output keys and values, or from arbitrary strings.
2) This format is used when an output file should be created only when a record is emitted for a given partition, which avoids empty files.
3) Key-value pairs can be of any type; they are written on individual lines of files and separated by a tab character.
4) An intermediate format used between MapReduce jobs, which means temporary outputs of maps are stored.
Data formats provided:
a) LazyOutputFormat
b) MultipleOutputs
c) SequenceFileOutputFormat
d) TextOutputFormat
Mark the correct option:
Q. Consider the following SQL queries:
a) select * from airjourney order by flightprice desc;
b) select * from playcentre where swimmingpool = "present";
The corresponding query in MongoDB will be:
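As a rough illustration of how the two SQL statements map onto MongoDB's find/sort shape, the sketch below emulates them in plain Python on in-memory documents. The helpers `find` and `sort_docs` and all sample data are hypothetical, invented for this sketch; they are not part of any MongoDB driver. The comments show the mongo shell form such queries usually take.

```python
# Mongo shell forms these two queries usually take:
#   a) db.airjourney.find().sort({ flightprice: -1 })
#   b) db.playcentre.find({ swimmingpool: "present" })

# Hypothetical sample documents, invented for this sketch.
airjourney = [
    {"flight": "A1", "flightprice": 450},
    {"flight": "B2", "flightprice": 900},
    {"flight": "C3", "flightprice": 120},
]
playcentre = [
    {"name": "FunZone", "swimmingpool": "present"},
    {"name": "PlayBox", "swimmingpool": "absent"},
]


def find(collection, query=None):
    """Return documents whose fields equal every key/value in `query`."""
    query = query or {}
    return [doc for doc in collection
            if all(doc.get(k) == v for k, v in query.items())]


def sort_docs(docs, field, direction):
    """Sort by `field`; direction 1 is ascending, -1 is descending."""
    return sorted(docs, key=lambda d: d[field], reverse=(direction == -1))


# a) select * ... order by flightprice desc  -> empty filter, sort by -1
result_a = sort_docs(find(airjourney), "flightprice", -1)
# b) select * ... where swimmingpool = "present" -> equality filter document
result_b = find(playcentre, {"swimmingpool": "present"})
```

The point to notice is that `select *` with no `where` clause becomes an empty filter, `order by ... desc` becomes a sort direction of -1, and an equality `where` clause becomes a filter document.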
Q. Below are a few statements related to Hive table formats. Mark which statement relates to which format, Bucketing or Partitioning:
1) If a field has high cardinality, this format should not be used.
2) Joins at the map side are quicker.
3) This format can be used on multiple fields.
4) This format divides the files by column name. It is effective for high volumes of data.
Which of the below is true?
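To make the difference between the two Hive layouts concrete, here is a minimal Python sketch of how rows might be distributed under each one. It is a simulation only: the rows, the function names, and the use of `zlib.crc32` as the hash are assumptions for illustration, and Hive's own hash function differs.

```python
import zlib

# Hypothetical rows: (country, user_id)
rows = [("IN", 101), ("US", 102), ("IN", 103), ("UK", 104), ("US", 105)]


def partition_layout(rows, col_index, col_name):
    """Partitioning: one directory per distinct value of the column."""
    layout = {}
    for row in rows:
        layout.setdefault(f"{col_name}={row[col_index]}", []).append(row)
    return layout


def bucket_layout(rows, col_index, num_buckets):
    """Bucketing: a fixed number of files; bucket = hash(value) mod num_buckets."""
    layout = {b: [] for b in range(num_buckets)}
    for row in rows:
        # crc32 is a stand-in hash for this sketch, not the one Hive uses.
        bucket = zlib.crc32(str(row[col_index]).encode()) % num_buckets
        layout[bucket].append(row)
    return layout


by_country = partition_layout(rows, 0, "country")   # one directory per country
by_user_dir = partition_layout(rows, 1, "user_id")  # one directory per user id
by_user_bucket = bucket_layout(rows, 1, 2)          # always exactly 2 buckets
```

In this sketch the number of partition directories grows with the number of distinct values in the column, while the number of buckets stays fixed regardless of how many distinct values there are.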
Q. In a technical quiz competition, three developers were asked to mention a few points regarding a Combiner in Hadoop:
Dev A - A Combiner implements the Reducer interface's reduce() method. It does not have any predefined interface of its own. Its use is optional. MapReduce will not run the Combiner when data does not need to be spilled to disk.
Dev B - A Combiner never runs on the same input repeatedly. MapReduce will run the Combiner when data does not need to be spilled to disk.
Dev C - Combiners always run in each MapReduce program, between the map and reduce phases. We cannot hardcode the number of combiners to be executed.
Mark the correct option.
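For readers who want to see what a combiner does to the data flow, the following is a minimal word-count simulation in plain Python. It is not Hadoop code: the function names, the input splits, and the `use_combiner` flag are hypothetical, and the sketch assumes the common case where the combiner reuses the reduce logic on each map task's local output.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit (word, 1) for every word in the line."""
    return [(word, 1) for word in line.split()]


def reduce_counts(key, values):
    """Reduce step: sum the counts for one key. Reused as the combiner."""
    return key, sum(values)


def group(pairs):
    """Group values by key, as the shuffle would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def run_job(splits, use_combiner):
    """Run the simulated job; return (final counts, pairs sent to reducers)."""
    shuffled = []
    for split in splits:  # each split stands for one map task
        pairs = [p for line in split for p in map_words(line)]
        if use_combiner:
            # Local aggregation of this map task's output only.
            pairs = [reduce_counts(k, vs) for k, vs in group(pairs).items()]
        shuffled.extend(pairs)
    result = dict(reduce_counts(k, vs) for k, vs in group(shuffled).items())
    return result, len(shuffled)


splits = [["a b a", "b a"], ["a c"]]
with_combiner, sent_with = run_job(splits, use_combiner=True)
without_combiner, sent_without = run_job(splits, use_combiner=False)
```

Both runs produce the same final counts, which is why a combiner can be treated as optional here; the only difference in this sketch is how many intermediate pairs reach the reduce step.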