Big Data Online Practice Test - 1
This test covers Big Data through important questions, ranging from the basics to an advanced level.
Q. A lead was explaining the concept of a reducer to his team. Which of the following correctly describes it:
1) Shuffle and Sort are a part of the reducer. They occur simultaneously.
2) Just like a mapper, a reduce-only job is also possible in any situation.
3) Hadoop will allow a reduce-only job if shuffle and sort are omitted.
4) The output of the reducer is the final output, which is stored in HDFS.
Mark the valid option.
Q. Which of the following statements are correct regarding the output format of MapReduce jobs:
1) If the last output filename of a job is test-r-00025, it means there were 26 reducers.
2) result-r-00001 means it is a reducer output. For a map-only job, the 'r' is replaced by 'm'.
3) By setting the configuration mapreduce.set.outputfilename, one can change the output file name.
4) If the format of the output file is part-r-yyyyy, then yyyyy represents the task number.
Mark the valid option.
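The naming pattern these statements refer to can be sketched in a few lines of Python. This is only an illustration, assuming the usual `<base>-<m|r>-<five-digit number>` layout with numbering that starts at 00000; the helper names `describe_output_file` and `task_count_from_last_file` are hypothetical and not part of any Hadoop API.

```python
import re

# Assumed layout: <base>-<m or r>-<five-digit, zero-based partition number>
OUTPUT_NAME = re.compile(r"^(?P<base>.+)-(?P<kind>[mr])-(?P<partition>\d{5})$")


def describe_output_file(name):
    """Split an output file name into (base, task kind, partition number)."""
    match = OUTPUT_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognised output file name: {name!r}")
    kind = "map" if match.group("kind") == "m" else "reduce"
    return match.group("base"), kind, int(match.group("partition"))


def task_count_from_last_file(name):
    """Infer the task count from the highest-numbered file.

    Assumes numbering starts at 0 and that every task wrote a file.
    """
    _, _, partition = describe_output_file(name)
    return partition + 1


print(describe_output_file("test-r-00025"))
print(task_count_from_last_file("test-r-00025"))
```

Because the numbering is zero-based, the count is one more than the number in the last file name.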
Q. A lead was training his team on Hive serialization. Which of the following points are valid or invalid:
1) SerDe is a library that is built into the Hadoop API.
2) Hive gets to know how to process a record through the Serializer or Deserializer.
3) To read and write delimited records, such as Control-A separated records, one can use the SerDe library.
4) Hive uses the MetadataTypedColumnsetSerDe class to serialize or deserialize data.
Mark the valid option.
Q. Below are a few points about the Pig framework. State whether each is an advantage (A) or a disadvantage (D):
1) In Pig, commands are not executed unless they are stored or dumped.
2) Pig does not enforce an explicit schema.
3) In Pig, it is possible to control the execution of every step.
4) In Pig, a program is not evaluated unless it produces an output file or outputs a message.
Mark the correct option.
Q. In an interview, the following statements were mentioned. Mark them as true or false:
1) Hive has high latency as compared to HBase.
2) HBase uses SQL while Hive uses HQL.
3) HBase supports secondary indexes.
4) HBase supports real time processing.
Mark the proper option.
Q. A developer working on MongoDB was told to write the following queries:
1) Update the document with _id: 3 in the movies collection and add the name "zootopia" to the animation array.
2) Find documents in the movies collection where Genre is not set to 'A'.
Mark the correct option.
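As a rough sketch of what the two statements ask for, the filter and update documents can be written as plain Python dictionaries. It assumes the `$push` operator for appending to the array (`$addToSet` would be an alternative that avoids duplicates) and `$ne` for the inequality; the helpers `apply_push` and `matches_not_equal` are hypothetical stand-ins that only mimic the operators in memory so the example runs without a database.

```python
import copy

# Statement 1: select the document with _id 3 and append to its animation array.
update_filter = {"_id": 3}
update_action = {"$push": {"animation": "zootopia"}}

# Statement 2: select documents whose Genre is anything other than 'A'.
genre_filter = {"Genre": {"$ne": "A"}}


def apply_push(document, update):
    """Mimic $push on a copy of the document (illustration only)."""
    result = copy.deepcopy(document)
    for field, value in update["$push"].items():
        result.setdefault(field, []).append(value)
    return result


def matches_not_equal(document, query):
    """Mimic $ne: a missing field also counts as 'not equal'."""
    return all(document.get(field) != cond["$ne"] for field, cond in query.items())


movie = {"_id": 3, "animation": ["up"], "Genre": "B"}
print(apply_push(movie, update_action))
print(matches_not_equal(movie, genre_filter))
```

With a driver such as pymongo, the same dictionaries would typically be passed to `update_one(update_filter, update_action)` and `find(genre_filter)` on the movies collection.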
Q. A developer was working on Scala code snippets. He was asked to write them for the following statements:
1) Iterate over each element in the tuple.
2) Add the values in the tuple.
The given tuple was: val ttuple = (5, 2, 9, 1).
Mark the correct syntax.
Q. A fresher was told to write a Pig query for a relation of nations. He was supposed to partition them based on their GDP rank.
Which of the following queries best suits the requirement?
Q. A developer was given some scenarios and was required to analyse whether to use MapReduce or Pig in each:
1) A definite driver control program is needed.
2) The code requires less debugging.
3) The output of one job is an input to another job.
4) The job involves the use of the distributed cache, cross products, and joins.
Mark the correct option about the scenarios.
Q. A developer was learning the Cassandra data model. Some of the points mentioned in the tutorial were as follows. Mark the ones that are correct / incorrect:
1) While using Cassandra in production, one should generally work with the Network Topology strategy.
2) There is at least one column family per keyspace.
3) The rows in each column are themselves a collection of many columns.
4) The Cassandra column families can be changed, as they are not predefined.
Choose the appropriate option.