Programming Questions
Here are some coding challenges for Python, Spark, and Hadoop:
**Python:**
1. Tricky:
- Q: Implement a function in Python that takes a list of integers as input and returns a list of all
pairs of integers that sum up to a specific target value.
```python
def find_pairs(nums, target):
    pairs = []
    seen = set()
    for num in nums:
        complement = target - num
        if complement in seen:
            pairs.append((num, complement))
        seen.add(num)
    return pairs

# Example usage:
nums = [2, 7, 11, 15, 8, 3]
target = 10
print(find_pairs(nums, target))
```
**Spark:**
2. Tricky:
- Q: Write a Spark code to find the average length of words in a text file stored in HDFS.
```python
from pyspark import SparkContext
# Initialize SparkContext
sc = SparkContext()
# Stop SparkContext
sc.stop()
```
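The stub above only starts and stops the SparkContext. A minimal sketch of the missing logic, assuming the input file lives at a placeholder HDFS path:
```python
from pyspark import SparkContext

sc = SparkContext()

# Read the file from HDFS (placeholder path), split each line into words,
# and average the word lengths across the whole file
lines = sc.textFile("hdfs:///path/to/input.txt")
words = lines.flatMap(lambda line: line.split())
avg_length = words.map(lambda w: len(w)).mean()
print(avg_length)

sc.stop()
```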
**Hadoop:**
3. Tricky:
- Q: Write a MapReduce program in Java to find the maximum temperature recorded for each
year from a large dataset of temperature records.
```java
// Mapper class
public class MaxTemperatureMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final int MISSING = 9999;

    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String line = value.toString();
        String year = line.substring(15, 19);
        int airTemperature;
        if (line.charAt(87) == '+') {
            airTemperature = Integer.parseInt(line.substring(88, 92));
        } else {
            airTemperature = Integer.parseInt(line.substring(87, 92));
        }
        String quality = line.substring(92, 93);
        if (airTemperature != MISSING && quality.matches("[01459]")) {
            context.write(new Text(year), new IntWritable(airTemperature));
        }
    }
}

// Reducer class
public class MaxTemperatureReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        int maxTemperature = Integer.MIN_VALUE;
        for (IntWritable value : values) {
            maxTemperature = Math.max(maxTemperature, value.get());
        }
        context.write(key, new IntWritable(maxTemperature));
    }
}
```
**Python:**
4. Tricky:
- Q: Implement a function in Python that takes a string as input and returns the count of each
character in the string as a dictionary.
```python
def count_characters(s):
    counts = {}
    for char in s:
        counts[char] = counts.get(char, 0) + 1
    return counts

# Example usage:
s = "hello world"
print(count_characters(s))
```
5. Tricky:
- Q: Write a Python function to find the longest substring without repeating characters in a
given string.
```python
def longest_substring_without_repeating(s):
    seen = {}
    start = 0
    max_length = 0
    for i, char in enumerate(s):
        if char in seen and start <= seen[char]:
            start = seen[char] + 1
        else:
            max_length = max(max_length, i - start + 1)
        seen[char] = i
    return max_length

# Example usage:
s = "abcabcbb"
print(longest_substring_without_repeating(s))
```
**Spark:**
6. Tricky:
- Q: Write a Spark code to find the average temperature recorded for each year from a large
dataset of temperature records stored in HDFS.
```python
from pyspark import SparkContext
# Initialize SparkContext
sc = SparkContext()
# Stop SparkContext
sc.stop()
```
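As above, the snippet is only a skeleton. A possible completion, assuming each input line is a comma-separated `year,temperature` record and a placeholder HDFS path:
```python
from pyspark import SparkContext

sc = SparkContext()

# Read the records (placeholder path); each line is assumed to look like "1950,22.5"
records = sc.textFile("hdfs:///path/to/temperatures.txt")

# Map each record to (year, (temperature, 1)), sum per year, then divide to get the average
pairs = records.map(lambda line: line.split(",")) \
    .map(lambda fields: (fields[0], (float(fields[1]), 1)))
sums = pairs.reduceByKey(lambda a, b: (a[0] + b[0], a[1] + b[1]))
averages = sums.mapValues(lambda total_count: total_count[0] / total_count[1])
print(averages.collect())

sc.stop()
```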
**Hadoop:**
7. Tricky:
- Q: Write a MapReduce program in Java to count the frequency of each word in a large text
document.
```java
// Mapper class
Public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
Private final static IntWritable one = new IntWritable(1);
Private Text word = new Text();
@Override
Public void map(LongWritable key, Text value, Context context) throws IOException,
InterruptedException {
String[] words = value.toString().split(\\s+);
For (String w : words) {
Word.set(w);
Context.write(word, one);
}
}
}
// Reducer class
Public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
Private IntWritable result = new IntWritable();
@Override
Public void reduce(Text key, Iterable<IntWritable> values, Context context) throws
IOException, InterruptedException {
Int sum = 0;
For (IntWritable val : values) {
Sum += val.get();
}
Result.set(sum);
Context.write(key, result);
}
}
```
**Python:**
8. Tricky:
- Q: Write a Python function to check if a given string is a palindrome or not, considering only
alphanumeric characters and ignoring case.
```python
def is_palindrome(s):
    s = ''.join(char.lower() for char in s if char.isalnum())
    return s == s[::-1]

# Example usage:
s = "A man, a plan, a canal: Panama"
print(is_palindrome(s))
```
9. Tricky:
- Q: Implement a function in Python that takes a list of integers as input and returns a list of all
unique triplets that sum up to zero.
```python
def three_sum(nums):
    triplets = []
    nums.sort()
    n = len(nums)
    for i in range(n):
        if i > 0 and nums[i] == nums[i - 1]:
            continue
        left, right = i + 1, n - 1
        while left < right:
            total = nums[i] + nums[left] + nums[right]
            if total < 0:
                left += 1
            elif total > 0:
                right -= 1
            else:
                triplets.append([nums[i], nums[left], nums[right]])
                while left < right and nums[left] == nums[left + 1]:
                    left += 1
                while left < right and nums[right] == nums[right - 1]:
                    right -= 1
                left += 1
                right -= 1
    return triplets

# Example usage:
nums = [-1, 0, 1, 2, -1, -4]
print(three_sum(nums))
```
**Spark:**
10. Tricky:
- Q: Write a Spark code to calculate the page rank of web pages using the iterative PageRank
algorithm.
```python
from pyspark import SparkContext
# Initialize SparkContext
sc = SparkContext()
# Stop SparkContext
sc.stop()
```
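The skeleton above omits the PageRank logic itself. A minimal iterative sketch, using a small made-up in-memory link list in place of real web-graph data (damping factor 0.85, 10 iterations):
```python
from pyspark import SparkContext

sc = SparkContext()

# Adjacency list: (page, [pages it links to]); a tiny illustrative graph
links = sc.parallelize([
    ("a", ["b", "c"]),
    ("b", ["c"]),
    ("c", ["a"]),
    ("d", ["c"]),
]).cache()

# Start every page with rank 1.0, then repeat the PageRank update a fixed number of times
ranks = links.mapValues(lambda neighbors: 1.0)
for _ in range(10):
    # Each page sends rank / out-degree to every page it links to
    contributions = links.join(ranks).flatMap(
        lambda page_data: [(dest, page_data[1][1] / len(page_data[1][0]))
                           for dest in page_data[1][0]]
    )
    # New rank = 0.15 + 0.85 * (sum of received contributions)
    ranks = contributions.reduceByKey(lambda x, y: x + y) \
        .mapValues(lambda contribution_sum: 0.15 + 0.85 * contribution_sum)

print(ranks.collect())

sc.stop()
```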
**Hadoop:**
11. Tricky:
- Q: Write a MapReduce program in Java to find the top 10 most frequent words in a large text
document.
```java
// Mapper class
public class TopWordsMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String[] words = value.toString().split("\\s+");
        for (String w : words) {
            word.set(w);
            context.write(word, one);
        }
    }
}

// Reducer class
public class TopWordsReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private TreeMap<Integer, String> topWords = new TreeMap<>();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        topWords.put(sum, key.toString());
        if (topWords.size() > 10) {
            topWords.remove(topWords.firstKey());
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        for (Map.Entry<Integer, String> entry : topWords.descendingMap().entrySet()) {
            context.write(new Text(entry.getValue()), new IntWritable(entry.getKey()));
        }
    }
}
```
**Python:**
12. Tricky:
- Q: Write a Python function to find the longest common subsequence (LCS) of two given
strings.
```python
def longest_common_subsequence(s1, s2):
    m, n = len(s1), len(s2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    lcs = []
    i, j = m, n
    while i > 0 and j > 0:
        if s1[i - 1] == s2[j - 1]:
            lcs.append(s1[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] > dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return ''.join(reversed(lcs))

# Example usage:
s1 = "ABCBDAB"
s2 = "BDCAB"
print(longest_common_subsequence(s1, s2))
```
13. Tricky:
- Q: Implement a function in Python to generate all valid parentheses combinations of given
length n.
```python
def generate_parentheses(n):
    def backtrack(s, left, right):
        if len(s) == 2 * n:
            parentheses.append(s)
            return
        if left < n:
            backtrack(s + '(', left + 1, right)
        if right < left:
            backtrack(s + ')', left, right + 1)

    parentheses = []
    backtrack('', 0, 0)
    return parentheses

# Example usage:
n = 3
print(generate_parentheses(n))
```
**Spark:**
14. Tricky:
- Q: Write a Spark code to find the top N most frequent words in a large text file, considering
case-insensitive word counts.
```python
from pyspark import SparkContext
# Initialize SparkContext
sc = SparkContext()
# Stop SparkContext
sc.stop()
```
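A possible completion of the skeleton above, assuming a placeholder HDFS path and N = 10:
```python
from pyspark import SparkContext

sc = SparkContext()

# Read the text file (placeholder path), lower-case every word, and count occurrences
lines = sc.textFile("hdfs:///path/to/input.txt")
word_counts = lines.flatMap(lambda line: line.split()) \
    .map(lambda word: (word.lower(), 1)) \
    .reduceByKey(lambda x, y: x + y)

# Take the N most frequent words (N = 10 here)
top_words = word_counts.takeOrdered(10, key=lambda pair: -pair[1])
print(top_words)

sc.stop()
```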
**Hadoop:**
15. Tricky:
- Q: Write a MapReduce program in Java to find the number of occurrences of each word in a
large text document, considering case-insensitive word counts.
```java
// Mapper class
Public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
Private final static IntWritable one = new IntWritable(1);
Private Text word = new Text();
@Override
Public void map(LongWritable key, Text value, Context context) throws IOException,
InterruptedException {
String[] words = value.toString().toLowerCase().split(\\s+);
For (String w : words) {
Word.set(w);
Context.write(word, one);
}
}
}
// Reducer class
Public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
Private IntWritable result = new IntWritable();
@Override
Public void reduce(Text key, Iterable<IntWritable> values, Context context) throws
IOException, InterruptedException {
Int sum = 0;
For (IntWritable val : values) {
Sum += val.get();
}
Result.set(sum);
Context.write(key, result);
}
}
```
**Python:**
16. Tricky:
- Q: Implement a Python function to find the kth smallest element in an unsorted array using
the quickselect algorithm.
```python
import random

def quickselect(nums, k):
    # Returns the kth smallest element (1-indexed) using quickselect
    def partition(left, right, pivot_index):
        pivot = nums[pivot_index]
        nums[pivot_index], nums[right] = nums[right], nums[pivot_index]
        store_index = left
        for i in range(left, right):
            if nums[i] < pivot:
                nums[i], nums[store_index] = nums[store_index], nums[i]
                store_index += 1
        nums[right], nums[store_index] = nums[store_index], nums[right]
        return store_index

    def select(left, right):
        if left == right:
            return nums[left]
        pivot_index = partition(left, right, random.randint(left, right))
        if pivot_index == k - 1:
            return nums[pivot_index]
        elif pivot_index > k - 1:
            return select(left, pivot_index - 1)
        else:
            return select(pivot_index + 1, right)

    return select(0, len(nums) - 1)

# Example usage:
nums = [3, 2, 1, 5, 6, 4]
k = 2
print(quickselect(nums, k))
```
17. Tricky:
- Q: Write a Python function to find the longest substring with at most two distinct characters
in a given string.
```python
def longest_substring_two_distinct(s):
    start = max_length = 0
    char_index_map = {}
    for i, char in enumerate(s):
        char_index_map[char] = i
        if len(char_index_map) > 2:
            min_index = min(char_index_map.values())
            del char_index_map[s[min_index]]
            start = min_index + 1
        max_length = max(max_length, i - start + 1)
    return max_length

# Example usage:
s = "eceba"
print(longest_substring_two_distinct(s))
```
**Spark:**
18. Tricky:
- Q: Write a Spark code to calculate the cosine similarity between two vectors using RDDs.
```python
from pyspark import SparkContext
import numpy as np

# Initialize SparkContext
sc = SparkContext()

# Define vectors
vec1 = np.array([1, 2, 3])
vec2 = np.array([4, 5, 6])

# Create RDDs
rdd1 = sc.parallelize(vec1)
rdd2 = sc.parallelize(vec2)

# Compute the dot product from the RDDs
dot_product = rdd1.zip(rdd2).map(lambda x: x[0] * x[1]).sum()

# Compute magnitudes
magnitude1 = np.linalg.norm(vec1)
magnitude2 = np.linalg.norm(vec2)

# Cosine similarity = dot product / (product of magnitudes)
cosine_similarity = dot_product / (magnitude1 * magnitude2)
print(cosine_similarity)

# Stop SparkContext
sc.stop()
```
**Hadoop:**
19. Tricky:
- Q: Write a MapReduce program in Java to find the median of a large dataset of integers.
```java
// Mapper class
Public class MedianMapper extends Mapper<LongWritable, Text, NullWritable, IntWritable> {
Private PriorityQueue<Integer> minHeap = new PriorityQueue<>();
Private PriorityQueue<Integer> maxHeap = new
PriorityQueue<>(Collections.reverseOrder());
@Override
Public void map(LongWritable key, Text value, Context context) throws IOException,
InterruptedException {
String[] tokens = value.toString().split(\\s+);
For (String token : tokens) {
Int num = Integer.parseInt(token);
If (maxHeap.isEmpty() || num <= maxHeap.peek()) {
maxHeap.add(num);
} else {
minHeap.add(num);
}
If (maxHeap.size() > minHeap.size() + 1) {
minHeap.add(maxHeap.poll());
} else if (minHeap.size() > maxHeap.size()) {
maxHeap.add(minHeap.poll());
}
}
}
@Override
Protected void cleanup(Context context) throws IOException, InterruptedException {
Int median;
If (maxHeap.size() == minHeap.size()) {
Median = (maxHeap.peek() + minHeap.peek()) / 2;
} else {
Median = maxHeap.peek();
}
Context.write(NullWritable.get(), new IntWritable(median));
}
}
```
**Python:**
20. Tricky:
- Q: Implement a Python function to generate all valid permutations of a given string.
```python
from itertools import permutations

def generate_permutations(s):
    return [''.join(permutation) for permutation in permutations(s)]

# Example usage:
s = "abc"
print(generate_permutations(s))
```
21. Tricky:
- Q: Write a Python function to check if a given string is an anagram of another string.
```python
def is_anagram(s1, s2):
    return sorted(s1) == sorted(s2)

# Example usage:
s1 = "listen"
s2 = "silent"
print(is_anagram(s1, s2))
```
**Spark:**
22. Tricky:
- Q: Write a Spark code to find the top N most frequent words in a large text file, considering
word counts and ignoring common stop words.
```python
from pyspark import SparkContext

# Initialize SparkContext
sc = SparkContext()

# Read the text file (placeholder path) and define a small illustrative stop-word set
lines = sc.textFile("hdfs:///path/to/input.txt")
stop_words = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it"}

# Split lines into words, filter stop words, and count frequency of each word
word_counts = lines.flatMap(lambda line: line.split()) \
    .filter(lambda word: word.lower() not in stop_words) \
    .map(lambda word: (word.lower(), 1)) \
    .reduceByKey(lambda x, y: x + y)

# Take the N most frequent remaining words (N = 10 here)
top_words = word_counts.takeOrdered(10, key=lambda pair: -pair[1])
print(top_words)

# Stop SparkContext
sc.stop()
```
**Hadoop:**
23. Tricky:
- Q: Write a MapReduce program in Java to find the most frequent word in a large text
document, ignoring common stop words.
```java
// Mapper class
Public class TopWordMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
Private final static IntWritable one = new IntWritable(1);
Private Text word = new Text();
@Override
Public void map(LongWritable key, Text value, Context context) throws IOException,
InterruptedException {
String[] words = value.toString().toLowerCase().split(\\s+);
For (String w : words) {
If (!stopWords.contains(w)) {
Word.set(w);
Context.write(word, one);
}
}
}
}
// Reducer class
Public class TopWordReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
Private Text mostFrequentWord = new Text();
Private IntWritable maxCount = new IntWritable(Integer.MIN_VALUE);
@Override
Public void reduce(Text key, Iterable<IntWritable> values, Context context) throws
IOException, InterruptedException {
Int sum = 0;
For (IntWritable val : values) {
Sum += val.get();
}
If (sum > maxCount.get()) {
mostFrequentWord.set(key);
maxCount.set(sum);
}
}
@Override
Protected void cleanup(Context context) throws IOException, InterruptedException {
Context.write(mostFrequentWord, maxCount);
}
}
```
**Python:**
24. Tricky:
- Q: Implement a Python function to generate all valid IP addresses from a given string
consisting of digits.
```python
def restore_ip_addresses(s):
    def backtrack(start, parts):
        if len(parts) == 4:
            if start == len(s):
                ips.append('.'.join(parts))
            return
        for length in range(1, 4):
            if start + length > len(s):
                break
            part = s[start:start + length]
            if (length > 1 and part[0] == '0') or (length == 3 and int(part) > 255):
                continue
            backtrack(start + length, parts + [part])

    ips = []
    backtrack(0, [])
    return ips

# Example usage:
s = "25525511135"
print(restore_ip_addresses(s))
```
25. Tricky:
- Q: Write a Python function to find the longest substring with at most K distinct characters in
a given string.
```python
def longest_substring_k_distinct(s, k):
    start = max_length = 0
    char_index_map = {}
    for i, char in enumerate(s):
        char_index_map[char] = i
        if len(char_index_map) > k:
            min_index = min(char_index_map.values())
            del char_index_map[s[min_index]]
            start = min_index + 1
        max_length = max(max_length, i - start + 1)
    return max_length

# Example usage:
s = "eceba"
k = 2
print(longest_substring_k_distinct(s, k))
```
**Spark:**
26. Tricky:
- Q: Write a Spark code to calculate the Euclidean distance between two vectors using RDDs.
```python
from pyspark import SparkContext
import numpy as np
# Initialize SparkContext
sc = SparkContext()
# Define vectors
vec1 = np.array([1, 2, 3])
vec2 = np.array([4, 5, 6])
# Create RDDs
rdd1 = sc.parallelize(vec1)
rdd2 = sc.parallelize(vec2)
# Compute squared difference for each coordinate, sum them, and take square root
euclidean_distance = np.sqrt(rdd1.zip(rdd2).map(lambda x: (x[0] - x[1]) ** 2).sum())
# Stop SparkContext
sc.stop()
```
**Hadoop:**
27. Tricky:
- Q: Write a MapReduce program in Java to find the most frequent word pair (bigram) in a
large text document.
```java
// Mapper class
public class TopBigramMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text bigram = new Text();

    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String[] words = value.toString().toLowerCase().split("\\s+");
        for (int i = 0; i < words.length - 1; i++) {
            bigram.set(words[i] + " " + words[i + 1]);
            context.write(bigram, one);
        }
    }
}

// Reducer class
public class TopBigramReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private Text mostFrequentBigram = new Text();
    private IntWritable maxCount = new IntWritable(Integer.MIN_VALUE);

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        if (sum > maxCount.get()) {
            mostFrequentBigram.set(key);
            maxCount.set(sum);
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        context.write(mostFrequentBigram, maxCount);
    }
}
```
**Python:**
28. Tricky:
- Q: Implement a Python function to find the longest word in a dictionary that can be formed
by deleting some characters of another word.
```python
def longest_word_by_deleting(s, dictionary):
    def is_subsequence(word, target):
        i = j = 0
        while i < len(word) and j < len(target):
            if word[i] == target[j]:
                j += 1
            i += 1
        return j == len(target)

    longest = ""
    for word in dictionary:
        if is_subsequence(s, word) and len(word) > len(longest):
            longest = word
    return longest

# Example usage:
s = "abpcplea"
dictionary = ["ale", "apple", "monkey", "plea"]
print(longest_word_by_deleting(s, dictionary))
```
29. Tricky:
- Q: Write a Python function to find the longest common prefix string amongst an array of
strings.
```python
def longest_common_prefix(strs):
    if not strs:
        return ""
    min_length = min(len(s) for s in strs)
    prefix = ""
    for i in range(min_length):
        char = strs[0][i]
        if all(s[i] == char for s in strs):
            prefix += char
        else:
            break
    return prefix

# Example usage:
strs = ["flower", "flow", "flight"]
print(longest_common_prefix(strs))
```
**Spark:**
30. Tricky:
- Q: Write a Spark code to calculate the Manhattan distance between two vectors using RDDs.
```python
from pyspark import SparkContext
import numpy as np
# Initialize SparkContext
sc = SparkContext()
# Define vectors
vec1 = np.array([1, 2, 3])
vec2 = np.array([4, 5, 6])
# Create RDDs
rdd1 = sc.parallelize(vec1)
rdd2 = sc.parallelize(vec2)
# Compute Manhattan distance
manhattan_distance = rdd1.zip(rdd2).map(lambda x: abs(x[0] - x[1])).sum()
# Stop SparkContext
sc.stop()
```
**Hadoop:**
31. Tricky:
- Q: Write a MapReduce program in Java to find the number of occurrences of each word
length in a large text document.
```java
// Mapper class
Public class WordLengthMapper extends Mapper<LongWritable, Text, IntWritable, IntWritable>
{
Private final static IntWritable one = new IntWritable(1);
Private IntWritable wordLength = new IntWritable();
@Override
Public void map(LongWritable key, Text value, Context context) throws IOException,
InterruptedException {
String[] words = value.toString().split(\\s+);
For (String w : words) {
wordLength.set(w.length());
context.write(wordLength, one);
}
}
}
// Reducer class
Public class WordLengthReducer extends Reducer<IntWritable, IntWritable, IntWritable,
IntWritable> {
Private IntWritable totalCount = new IntWritable();
@Override
Public void reduce(IntWritable key, Iterable<IntWritable> values, Context context) throws
IOException, InterruptedException {
Int sum = 0;
For (IntWritable val : values) {
Sum += val.get();
}
totalCount.set(sum);
context.write(key, totalCount);
}
}
```
**Python:**
32. Tricky:
- Q: Implement a Python function to find the longest palindrome substring in a given string.
```python
def longest_palindrome(s):
    def expand_from_center(left, right):
        while left >= 0 and right < len(s) and s[left] == s[right]:
            left -= 1
            right += 1
        return s[left + 1:right]

    longest = ""
    for i in range(len(s)):
        odd_palindrome = expand_from_center(i, i)
        even_palindrome = expand_from_center(i, i + 1)
        longest = max(longest, odd_palindrome, even_palindrome, key=len)
    return longest

# Example usage:
s = "babad"
print(longest_palindrome(s))
```
33. Tricky:
- Q: Write a Python function to find all valid combinations of k numbers that sum up to n
(unique numbers from 1 to 9).
```python
def combination_sum(k, n):
    def backtrack(start, target, path):
        if target == 0 and len(path) == k:
            combinations.append(path)
            return
        if target < 0 or len(path) == k:
            return
        for i in range(start, 10):
            backtrack(i + 1, target - i, path + [i])

    combinations = []
    backtrack(1, n, [])
    return combinations

# Example usage:
k = 3
n = 7
print(combination_sum(k, n))
```
**Spark:**
34. Tricky:
- Q: Write a Spark code to calculate the Jaccard similarity between two sets using RDDs.
```python
from pyspark import SparkContext
# Initialize SparkContext
sc = SparkContext()
# Define sets
set1 = set([1, 2, 3, 4, 5])
set2 = set([3, 4, 5, 6, 7])
# Create RDDs
rdd1 = sc.parallelize(set1)
rdd2 = sc.parallelize(set2)
# Stop SparkContext
sc.stop()
```
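The skeleton above defines the sets but never computes the similarity. A minimal sketch of the missing steps:
```python
from pyspark import SparkContext

sc = SparkContext()

# Define the two sets and distribute them as RDDs
set1 = set([1, 2, 3, 4, 5])
set2 = set([3, 4, 5, 6, 7])
rdd1 = sc.parallelize(set1)
rdd2 = sc.parallelize(set2)

# Jaccard similarity = |intersection| / |union|
intersection_size = rdd1.intersection(rdd2).count()
union_size = rdd1.union(rdd2).distinct().count()
jaccard_similarity = intersection_size / union_size
print(jaccard_similarity)  # 3 / 7 for these sets

sc.stop()
```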
**Hadoop:**
35. Tricky:
- Q: Write a MapReduce program in Java to find the number of occurrences of each vowel in a
large text document.
```java
// Mapper class
public class VowelCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
private final static IntWritable one = new IntWritable(1);
private Text vowel = new Text();
@Override
public void map(LongWritable key, Text value, Context context) throws IOException,
InterruptedException {
String[] words = value.toString().toLowerCase().split("\\s+");
for (String w : words) {
for (char c : w.toCharArray()) {
if ("aeiou".contains(String.valueOf(c))) {
vowel.set(String.valueOf(c));
context.write(vowel, one);
}
}
}
}
}
// Reducer class
public class VowelCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
private IntWritable totalCount = new IntWritable();
@Override
public void reduce(Text key, Iterable<IntWritable> values, Context context) throws
IOException, InterruptedException {
int sum = 0;
for (IntWritable val : values) {
sum += val.get();
}
totalCount.set(sum);
context.write(key, totalCount);
}
}
```
**Python:**
36. Tricky:
- Q: Implement a Python function to find the longest substring without repeating characters in
a given string.
```python
def longest_substring_without_repeating(s):
    start = max_length = 0
    char_index_map = {}
    for i, char in enumerate(s):
        if char in char_index_map and start <= char_index_map[char]:
            start = char_index_map[char] + 1
        else:
            max_length = max(max_length, i - start + 1)
        char_index_map[char] = i
    return max_length

# Example usage:
s = "abcabcbb"
print(longest_substring_without_repeating(s))
```
37. Tricky:
- Q: Write a Python function to find all distinct combinations of a given size k in a given array
of integers, where each number is used exactly once.
```python
def combinations(nums, k):
    def backtrack(start, path):
        if len(path) == k:
            combinations.append(path)
            return
        for i in range(start, len(nums)):
            backtrack(i + 1, path + [nums[i]])

    combinations = []
    backtrack(0, [])
    return combinations

# Example usage:
nums = [1, 2, 3, 4]
k = 2
print(combinations(nums, k))
```
**Spark:**
38. Tricky:
- Q: Write a Spark code to find the top N most frequent pairs of words (bigrams) in a large text
file.
```python
from pyspark import SparkContext

# Initialize SparkContext
sc = SparkContext()

# Read the text file (placeholder path)
lines = sc.textFile("hdfs:///path/to/input.txt")

# Split each line into words, build bigrams within each line, and count each bigram
bigrams = lines.map(lambda line: line.split()) \
    .flatMap(lambda words: [((words[i], words[i + 1]), 1) for i in range(len(words) - 1)]) \
    .reduceByKey(lambda x, y: x + y)

# Take the N most frequent bigrams (N = 10 here)
top_bigrams = bigrams.takeOrdered(10, key=lambda pair: -pair[1])
print(top_bigrams)

# Stop SparkContext
sc.stop()
```
**Hadoop:**
39. Tricky:
- Q: Write a MapReduce program in Java to find the number of occurrences of each word
length in a large text document, considering only words with lengths greater than or equal to 5.
```java
// Mapper class
Public class LongWordLengthMapper extends Mapper<LongWritable, Text, IntWritable,
IntWritable> {
Private final static IntWritable one = new IntWritable(1);
Private IntWritable wordLength = new IntWritable();
@Override
Public void map(LongWritable key, Text value, Context context) throws IOException,
InterruptedException {
String[] words = value.toString().split(\\s+);
For (String w : words) {
If (w.length() >= 5) {
wordLength.set(w.length());
context.write(wordLength, one);
}
}
}
}
// Reducer class
Public class LongWordLengthReducer extends Reducer<IntWritable, IntWritable, IntWritable,
IntWritable> {
Private IntWritable totalCount = new IntWritable();
@Override
Public void reduce(IntWritable key, Iterable<IntWritable> values, Context context) throws
IOException, InterruptedException {
Int sum = 0;
For (IntWritable val : values) {
Sum += val.get();
}
totalCount.set(sum);
context.write(key, totalCount);
}
}
```
**Python:**
40. Tricky:
- Q: Implement a Python function to find the longest substring with at least K repeating
characters in a given string.
```python
from collections import Counter

def longest_substring_k_repeating(s, k):
    def is_valid(substring):
        counts = Counter(substring)
        return all(count >= k for count in counts.values())

    max_length = 0
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            if is_valid(s[i:j]):
                max_length = max(max_length, j - i)
    return max_length

# Example usage:
s = "aaabb"
k = 3
print(longest_substring_k_repeating(s, k))
```
41. Tricky:
- Q: Write a Python function to find the kth largest element in an unsorted array using the
quickselect algorithm.
```python
import random
# Example usage:
nums = [3, 2, 1, 5, 6, 4]
k = 2
print(quickselect(nums, k))
```
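The block above calls `quickselect` without defining it. One possible definition, a randomized quickselect for the kth largest element:
```python
import random

def quickselect(nums, k):
    # Partition around a random pivot and recurse into the side
    # that must contain the kth largest element
    pivot = random.choice(nums)
    larger = [n for n in nums if n > pivot]
    equal = [n for n in nums if n == pivot]
    smaller = [n for n in nums if n < pivot]
    if k <= len(larger):
        return quickselect(larger, k)
    elif k <= len(larger) + len(equal):
        return pivot
    else:
        return quickselect(smaller, k - len(larger) - len(equal))

# Example usage:
nums = [3, 2, 1, 5, 6, 4]
k = 2
print(quickselect(nums, k))  # 5
```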
**Spark:**
42. Tricky:
- Q: Write a Spark code to calculate the Pearson correlation coefficient between two vectors
using RDDs.
```python
from pyspark import SparkContext
import numpy as np
# Initialize SparkContext
sc = SparkContext()
# Define vectors
vec1 = np.array([1, 2, 3])
vec2 = np.array([4, 5, 6])
# Create RDDs
rdd1 = sc.parallelize(vec1)
rdd2 = sc.parallelize(vec2)
# Stop SparkContext
sc.stop()
```
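The skeleton above stops before any computation. A minimal sketch of the Pearson correlation on RDDs, keeping the same small example vectors:
```python
from pyspark import SparkContext
import numpy as np

sc = SparkContext()

# Define vectors and distribute them as RDDs
vec1 = np.array([1, 2, 3])
vec2 = np.array([4, 5, 6])
rdd1 = sc.parallelize(vec1)
rdd2 = sc.parallelize(vec2)

# Pearson r = sum((x - mean_x)(y - mean_y)) /
#             (sqrt(sum((x - mean_x)^2)) * sqrt(sum((y - mean_y)^2)))
mean1 = rdd1.mean()
mean2 = rdd2.mean()
numerator = rdd1.zip(rdd2).map(lambda p: (p[0] - mean1) * (p[1] - mean2)).sum()
denominator = np.sqrt(rdd1.map(lambda x: (x - mean1) ** 2).sum()) * \
              np.sqrt(rdd2.map(lambda y: (y - mean2) ** 2).sum())
pearson = numerator / denominator
print(pearson)  # 1.0 for these perfectly correlated vectors

sc.stop()
```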
**Hadoop:**
43. Tricky:
- Q: Write a MapReduce program in Java to find the number of occurrences of each word
length in a large text document, considering only words with lengths less than or equal to 10.
```java
// Mapper class
Public class ShortWordLengthMapper extends Mapper<LongWritable, Text, IntWritable,
IntWritable> {
Private final static IntWritable one = new IntWritable(1);
Private IntWritable wordLength = new IntWritable();
@Override
Public void map(LongWritable key, Text value, Context context) throws IOException,
InterruptedException {
String[] words = value.toString().split(\\s+);
For (String w : words) {
If (w.length() <= 10) {
wordLength.set(w.length());
context.write(wordLength, one);
}
}
}
}
// Reducer class
Public class ShortWordLengthReducer extends Reducer<IntWritable, IntWritable, IntWritable,
IntWritable> {
Private IntWritable totalCount = new IntWritable();
@Override
Public void reduce(IntWritable key, Iterable<IntWritable> values, Context context) throws
IOException, InterruptedException {
Int sum = 0;
For (IntWritable val : values) {
Sum += val.get();
}
totalCount.set(sum);
context.write(key, totalCount);
}
}
```
**Python:**
48. Tricky:
- Q: Implement a Python function to find the longest substring with at most K distinct characters in a given string.
```python
def longest_substring_k_distinct(s, k):
    start = max_length = 0
    char_count = {}
    for end, char in enumerate(s):
        char_count[char] = char_count.get(char, 0) + 1
        while len(char_count) > k:
            left_char = s[start]
            char_count[left_char] -= 1
            if char_count[left_char] == 0:
                del char_count[left_char]
            start += 1
        max_length = max(max_length, end - start + 1)
    return max_length

# Example usage:
s = "eceba"
k = 2
print(longest_substring_k_distinct(s, k))
```
49. Tricky:
- Q: Write a Python function to find the maximum sum of a contiguous subarray within a
given one-dimensional array of integers.
```python
def max_subarray_sum(nums):
    max_sum = curr_sum = nums[0]
    for num in nums[1:]:
        curr_sum = max(num, curr_sum + num)
        max_sum = max(max_sum, curr_sum)
    return max_sum

# Example usage:
nums = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
print(max_subarray_sum(nums))
```
**Spark:**
50. Tricky:
- Q: Write a Spark code to calculate the cosine similarity between two vectors using RDDs.
```python
from pyspark import SparkContext
import numpy as np

# Initialize SparkContext
sc = SparkContext()

# Define vectors
vec1 = np.array([1, 2, 3])
vec2 = np.array([4, 5, 6])

# Create RDDs
rdd1 = sc.parallelize(vec1)
rdd2 = sc.parallelize(vec2)

# Compute dot product
dot_product = rdd1.zip(rdd2).map(lambda x: x[0] * x[1]).sum()

# Compute magnitudes
magnitude1 = np.sqrt(rdd1.map(lambda x: x ** 2).sum())
magnitude2 = np.sqrt(rdd2.map(lambda x: x ** 2).sum())

# Cosine similarity = dot product / (product of magnitudes)
cosine_similarity = dot_product / (magnitude1 * magnitude2)
print(cosine_similarity)

# Stop SparkContext
sc.stop()
```